Introduction

There is growing interest in the structure and functioning of coral reef ecosystems. This interest encompasses both intrinsic biogeographic questions and the need to understand, and manage, coral reef ecosystems as they become increasingly impacted by human exploitation and climate change (Kuffner and Toth 2016; Williams et al. 2019; Bellwood et al. 2024). Our ability to ask key biogeographic questions has been fuelled in recent years by a rapid increase in the availability of large-scale datasets, whether collected by specialists (Heenan et al. 2017; Emslie et al. 2020), by citizen scientists (e.g. Reef Life Survey) (Edgar and Stuart-Smith 2014; Edgar et al. 2020) or collated from various sources (e.g. Cinner et al. 2016, 2020). As a result, in recent decades, we have seen a rapid transformation in our understanding of the factors shaping reef fish assemblages. For example, there is increasing evidence of spatial structuring, from microhabitats (Depczynski and Bellwood 2004; Kane et al. 2009), through local-scale habitat variation (Russ 1984; Eurich et al. 2018; Oakley-Cogan et al. 2020), to regional-scale patterns and global distribution gradients (Cheal et al. 2013; Kulbicki et al. 2013; Siqueira et al. 2021; Pinheiro et al. 2023). Superimposed on this spatial variation is the potential for different approaches, depending on the metrics used to characterize communities, ranging from species presence–absence data (e.g. Parravicini et al. 2013), to abundances and community composition data (e.g. Connolly et al. 2005, 2017), trait-based ‘functional’ classifications (e.g. Mouillot et al. 2014; Hemingson and Bellwood 2018; McLean et al. 2021) and, finally, direct estimates of ecosystem functions (e.g. Tebbett et al. 2020).

As expected, in any ecological system there is variation at every conceivable spatial scale, and this is often expressed in both taxonomic and (inferred) functional contexts. However, in studying this variation there is a trade-off in terms of the scale of observations that one may want and the level of detail that can be obtained. Global-scale evaluations rarely have the level of detail seen in regional or small-scale analyses, where local-scale variability may be important. However, it is becoming increasingly important to understand patterns of distribution, abundance and functional capacity on coral reefs at ever increasing scales, to match the expanding scale of human impacts (Bellwood et al. 2019a; Williams et al. 2019). Nevertheless, this raises the question: are we sacrificing important ecological information in our search for global-scale patterns? To address this question, we specifically examine the extent to which the loss of information on local-scale variation in habitat use may compromise our understanding of global patterns of distribution, abundance and function.

Recent studies have shown the potential for sampling biases in coral reef survey methods and the apparent underrepresentation of major reef habitats, particularly the reef flat (Harborne 2013; Bellwood et al. 2018, 2020). Indeed, there appears to be a distinct habitat-specific division in reef studies, with GIS, geological and physiological research focusing on the reef flat, versus ecological, functional and fisheries research, and monitoring, predominantly on the reef slope (Bellwood et al. 2020; Lutzenkirchen et al. 2024). This potential for a habitat-specific division of research focus raises the question: to what extent could our understanding of global patterns of reef fish communities be dependent on the habitats sampled? Indeed, is there greater variation in reef fish assemblages across broad biogeographic gradients, or habitat gradients? To explore these questions, we utilized a reef fish dataset that spans 12,000 km from the Cocos (Keeling) Islands to French Polynesia, covering the world’s largest tropical biodiversity gradient in reef fishes. However, these reef fish data also included consistent high-resolution habitat-specific surveys at every sampling location. By specifically incorporating habitat associations into the surveys, it is possible to disentangle the relative contribution of habitat associations from broader, biogeographic, influences on the structure of fish communities across large spatial scales.

Materials and methods

We used a large-scale series of fish surveys that span the Indo-Pacific. Designed to survey reefs at a global scale and encompassing the major longitudinal biodiversity gradient, this dataset also specifically sampled three key habitats in each region (Connolly et al. 2005, 2017; Morais et al. 2020). Between 1998 and 2010, nine geographic regions, spanning 12,000 km, were sampled from the Cocos (Keeling) Islands in the Indian Ocean, through the Indo-Australian Archipelago and across the Pacific to French Polynesia (Fig. 1a). In each region, two reefs/islands were sampled (except one each at Cocos [Keeling] and Rowley Shoals, and three on the Great Barrier Reef), with three habitats examined on each reef: the reef flat, which lay at approximately chart datum (± 1 m) and at least 5 m behind the reef crest; the reef crest at 0–3 m, which marks the transition between the flat and slope; and the slope at 8–12 m (Fig. 1b). Within each habitat there were four replicate transects (see Table S1 for replicate numbers in each region). Where possible, the four transects in each of the different habitats were adjacent and spatially aligned along the same reef parallel with the reef crest; all transects were in moderately exposed areas.

Fig. 1

a The nine biogeographic regions where reef fishes from the family Labridae (including parrotfishes) were surveyed. In each region we surveyed two reefs/islands (except one at Cocos and Rowley and three on the GBR), three habitats per reef, and four transects within each habitat at each reef. b Schematic depiction of the three key habitats surveyed. The labrids c Novaculichthys taeniourus and d Coris gaimard (photographs J.P. Krajewski). GBR = Great Barrier Reef, PNG = Papua New Guinea

The transects (fish surveys) were restricted to the family Labridae (including parrotfishes), as a representative coral reef fish family that encompasses the largest possible range of sizes, morphologies, and trophic categories (cf. Wainwright et al. 2004; Floeter et al. 2018) (Fig. 1c, d). By focusing on just one family we could maintain consistent swimming speeds, ensuring all individuals were counted while minimizing over-counting (Ward-Paige et al. 2010). On each transect, fishes were surveyed using 20-min timed swims (with each replicate transect/survey being approximately 235 m long [Bellwood and Wainwright 2001]) which were specifically designed to minimize diver effects (Dickens et al. 2011; Emslie et al. 2018). Each survey was based on a size-stratified sampling design to maximize detection accuracy: one diver counted large potentially diver-averse fishes > 10 cm total length, while the other diver focused solely on smaller species 2.5–10 cm. All fish were placed in size categories. All > 10 cm counts were by the same diver (DRB). Transect widths were 5 m and 1 m, for fish > 10 and < 10 cm, respectively, with data subsequently standardized for differences in transect width prior to analysis.
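
The width standardization described above can be illustrated with a short sketch. The text does not state the exact procedure used, so conversion to a density per 100 m² is an assumption here, as are the function name and the example count; only the transect widths and the approximate 235 m length come from the text.

```python
# Illustrative sketch only: the exact standardization used is not specified
# in the text; density per 100 m^2 is an assumed, common choice.
TRANSECT_LENGTH_M = 235.0  # approximate length of a 20-min timed swim (from the text)
WIDTH_LARGE_M = 5.0        # belt width for fishes > 10 cm (from the text)
WIDTH_SMALL_M = 1.0        # belt width for fishes < 10 cm (from the text)


def density_per_100m2(count, width_m, length_m=TRANSECT_LENGTH_M):
    """Return individuals per 100 m^2 for a count made on a belt of the given width."""
    area_m2 = width_m * length_m
    return 100.0 * count / area_m2


# Hypothetical raw count of 47 individuals in each size class.
large = density_per_100m2(47, WIDTH_LARGE_M)  # belt area 1175 m^2
small = density_per_100m2(47, WIDTH_SMALL_M)  # belt area 235 m^2
print(large, small)
```

The same raw count yields a five-fold higher density on the 1 m belt than on the 5 m belt, which is why counts from the two size classes cannot be pooled before standardization.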

The fish assemblages were analysed based on species-level taxonomy and traits. For the latter, we used the six categorical traits (i.e. size, mobility, period of activity, schooling behaviour, vertical position in the water column, and diet) and trait levels (see Tables S2 and S3 for an overview of trait categories and species-level trait designations) that are commonly used in such analyses (e.g. Mouillot et al. 2014; D’agata et al. 2016; Richardson et al. 2018; Denis et al. 2019; Dubuc et al. 2023; Olán-González et al. 2023). Although the Labridae have some of the best-documented traits that are causally related to a range of trophic and locomotor functions (Bellwood and Choat 1990; Fulton et al. 2001, 2017; Wainwright et al. 2002, 2004), we used the widely applied categorical traits to evaluate the relative merit of this general approach. Trait-level categorization for each species was based on published literature (e.g. Bellwood et al. 2006), books (Randall et al. 1997; Randall 2005), FishBase (Froese and Pauly 2024), and consensus expert opinion. Based on this trait classification, fish trait entities were defined (i.e. unique trait combinations) following Mouillot et al. (2014).
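
To make the notion of a trait entity concrete, the following minimal sketch pools species abundances by their unique combination of the six categorical traits. The species names, trait levels, and counts are hypothetical placeholders (the actual designations are in Tables S2 and S3), and the function name is an illustrative assumption.

```python
from collections import defaultdict

# Hypothetical trait table: species -> (size, mobility, activity period,
# schooling, vertical position, diet). Levels are placeholders, not Table S3.
traits = {
    "species_A": ("small", "sedentary", "diurnal", "solitary", "benthic", "invertivore"),
    "species_B": ("small", "sedentary", "diurnal", "solitary", "benthic", "invertivore"),
    "species_C": ("large", "mobile", "diurnal", "schooling", "benthopelagic", "herbivore"),
}

# Hypothetical counts on a single transect.
abundance = {"species_A": 12, "species_B": 3, "species_C": 7}


def trait_entity_abundance(traits, abundance):
    """Sum abundances of all species that share the same trait combination."""
    totals = defaultdict(int)
    for species, count in abundance.items():
        totals[traits[species]] += count
    return dict(totals)


entities = trait_entity_abundance(traits, abundance)
print(len(entities), "trait entities from", len(traits), "species")
```

Species A and B share all six trait levels and therefore collapse into a single entity, which is the mechanism by which a species list condenses into a smaller set of trait entities.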

The two community composition datasets, a) species-level taxonomic abundances, and b) the abundance of trait entities, were fourth-root transformed and converted to Bray–Curtis dissimilarity matrices. This transformation was necessary due to the presence of a few highly abundant species (a common feature of ecological count data), with this transformation representing one of intermediate strength that helps balance the contribution of common versus rare taxa (Clarke 1993). The Bray–Curtis coefficient is widely applied and was chosen as it is one of the most reliable dissimilarity measures for ecological count data (Clarke 1993). Based on these dissimilarity matrices, the arrangement of community data was initially visualized in multivariate space using unconstrained ordinations (i.e. principal coordinate analysis [PCoA]). Scree plots suggested that plotting the first two axes was sufficient in both cases, with relatively little extra information gained from plotting further axes. Following this, we tested for significant groupings among habitats and biogeographic regions in both datasets using permutational analysis of variance (PERMANOVA) with 9999 permutations. Homogeneity of dispersions was also examined among habitats and regions using permutational multivariate analysis of dispersions (PERMDISP). We then visualized the results of these analyses using distance-based redundancy analysis (dbRDA). Specifically, the dbRDA analyses were constrained by biogeographic region and habitat to explore where the major axes of variation occurred through the community data with respect to these two factors. To assist with visualization of how variation was partitioned across major axes in each dataset, we produced two separate ordinations for both datasets, one coloured/labelled by habitats and one coloured/labelled by biogeographic regions. All multivariate analyses were conducted using the vegan package (Oksanen et al. 2019) in R (version 4.2.2; R Core Team 2022).
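
The core of this workflow (fourth-root transformation, Bray–Curtis dissimilarity, and a permutation test on a pseudo-F statistic) can be sketched in a few lines. This is a simplified illustration rather than the analysis itself: the study used vegan in R with two factors, whereas the sketch below is a one-factor version in plain Python, and all function names and the toy counts are assumptions.

```python
import random


def fourth_root(row):
    """Fourth-root transform to down-weight highly abundant taxa."""
    return [x ** 0.25 for x in row]


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den > 0 else 0.0


def distance_matrix(samples):
    """Pairwise Bray-Curtis matrix on fourth-root transformed counts."""
    rows = [fourth_root(r) for r in samples]
    n = len(rows)
    return [[bray_curtis(rows[i], rows[j]) for j in range(n)] for i in range(n)]


def pseudo_f(d, groups):
    """One-factor pseudo-F computed from squared dissimilarities."""
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = sum(d[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in labels:
        idx = [i for i in range(n) if groups[i] == g]
        pairs = sum(d[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:])
        ss_within += pairs / len(idx)
    a = len(labels)
    return ((ss_total - ss_within) / (a - 1)) / (ss_within / (n - a))


def permanova(d, groups, permutations=999, seed=1):
    """Permutation p-value: share of shuffled labelings with F >= observed F."""
    rng = random.Random(seed)
    observed = pseudo_f(d, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(d, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy data: 8 hypothetical transects x 3 taxa, two habitats.
counts = [[16, 1, 0], [81, 0, 1], [16, 0, 0], [81, 1, 0],
          [0, 1, 16], [1, 0, 81], [0, 0, 16], [0, 1, 81]]
habitat = ["slope"] * 4 + ["flat"] * 4

d = distance_matrix(counts)
f_obs, p_value = permanova(d, habitat)
print(round(f_obs, 2), p_value)
```

Because only the group labels are shuffled, the test makes no distributional assumptions about the counts, although, like PERMANOVA generally, it remains sensitive to differences in dispersion among groups, hence the accompanying PERMDISP tests.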

Results

Across the 204 transects surveyed, 135 species of labrid were recorded. A PERMANOVA revealed that the taxonomic structure of the fish community differed significantly both among biogeographic regions (F8,193 = 13.241, p < 0.001) and among habitats (F2,193 = 47.341, p < 0.001), while PERMDISPs also suggested that dispersions significantly differed among regions (F8,195 = 5.906, p < 0.001) and habitats (F2,201 = 5.632, p < 0.01). However, the unconstrained PCoA and the constrained dbRDA ordinations both revealed that the major axis of variation in fish assemblages lay among habitats, rather than among biogeographic regions (Figs. 2, S1). Specifically, there was clear variation in reef fish assemblages among slope, crest, and flat habitats, with this variation largely aligning with dbRDA Axis 1 (which accounted for 22.5% of the total variation in the dataset and 44.1% of the variation explained by the two constraining factors) (Fig. 2a). By contrast, variation in fish assemblages across biogeographic regions largely aligned with dbRDA Axis 2, which accounted for less than half the variation explained along dbRDA Axis 1 (just 9.3% and 18.3% of total and explained variation, respectively) (Fig. 2b). Indeed, there was a high degree of overlap for most biogeographic regions in this multivariate ordination space, with the main trend being a series of overlapping distributions extending from high diversity regions (Indonesia and Papua New Guinea) to the low diversity regions (French Polynesia and Cocos [Keeling]) (Fig. 2b) on Axis 2, while maintaining the slope-crest-flat distinction on Axis 1.
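
The relationship between the two percentages reported for each axis can be checked with simple arithmetic. The sketch below uses only the four values given above; the variable names are illustrative, and the implied constrained fraction is a back-calculation rather than a reported result.

```python
# Reported values for the taxonomic dbRDA (percent of variation).
axis1_total, axis1_explained = 22.5, 44.1
axis2_total, axis2_explained = 9.3, 18.3

# If an axis holds X% of total variation and Y% of constrained (explained)
# variation, the constraints together account for about X / (Y / 100) percent.
constrained_from_axis1 = axis1_total / (axis1_explained / 100.0)
constrained_from_axis2 = axis2_total / (axis2_explained / 100.0)

# Share of variation on Axis 2 relative to Axis 1.
axis2_to_axis1 = axis2_total / axis1_total

print(round(constrained_from_axis1, 1), round(constrained_from_axis2, 1),
      round(axis2_to_axis1, 2))
```

Both axes imply that habitat and region together constrain roughly half of the total variation, and the Axis 2 to Axis 1 ratio is consistent with the statement that the regional axis carries less than half the variation of the habitat axis.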

Fig. 2

The composition of labrid fish assemblages across the Indo-Pacific: taxonomic structure. Multivariate ordination based on a distance-based redundancy analysis (dbRDA) of reef fish surveys (individual points) and constrained by habitat and region. a Shows the points coloured/labelled by habitat, while b shows the points coloured/labelled by region. Note the strong separation along dbRDA axis 1 regardless of the biogeographic region. GBR = Great Barrier Reef, PNG = Papua New Guinea

The orthogonal nature of the two trends (i.e. habitat differences vs. biogeographic regions) in multivariate space highlights the fact that habitats divide communities in a similar manner across all communities in all regions (in both constrained and unconstrained multivariate space [Figs. 2, S1]). Each region has a distinctive slope, crest, and flat fauna but these faunas are largely replicated across the Indo-Pacific. Thus, within the slope community space there is a gradient of sites based on diversity that is likewise replicated in crest and flat communities. In effect, reef slopes are more similar to reef slopes 12,000 km away than they are to adjacent reef flats on the same reef just 20 to 30 m away. Habitat influence, rather than biogeographic region, represents the major axis of variation in the taxonomic structure of reef fish faunas.

When common traits were used to define trait entities within fish assemblages, the 135 species of labrid condensed into 32 unique trait entities (Fig. S2), but the patterns in community structure changed only slightly (Figs. 3, S3). Again, a PERMANOVA revealed significant differences both among biogeographic regions (F8,193 = 9.278, p < 0.001) and among habitats (F2,193 = 40.006, p < 0.001), while PERMDISPs suggested that dispersions significantly differed among regions (F8,195 = 2.581, p < 0.05), but not among habitats in this case (F2,201 = 2.609, p = 0.076). With respect to the ordinations, the unconstrained PCoA and the constrained dbRDA ordinations again revealed that the major axis of variation lay among habitats, rather than among biogeographic regions (Figs. 3, S3). Specifically, variation across habitats still largely aligned with dbRDA Axis 1, and the trait composition of fishes on the slope clearly separated along this axis; however, there was a higher degree of overlap between fish faunas in crest and flat habitats (Fig. 3a). Likewise, when transects were considered in a biogeographic context, the overlap in the trait composition of the fish faunas was even more extensive than for taxonomic composition (Figs. 2b, 3b). This suggests that: a) habitat differences, not biogeographic regional differences, represent the major axis of variation in the trait composition of the fish fauna, and b) taxonomy is far more sensitive in separating habitat and biogeographic associations than commonly used traits.

Fig. 3

The composition of labrid fish assemblages across the Indo-Pacific: trait structure. Multivariate ordination based on a distance-based redundancy analysis (dbRDA) of reef fish surveys (individual points) and constrained by habitat and region. a Shows the points coloured/labelled by habitat, while b shows the points coloured/labelled by region. Note the weaker separation compared to the taxonomic structure in Fig. 2 although traits still show stronger separation among habitats rather than among biogeographic regions along dbRDA 1. GBR = Great Barrier Reef, PNG = Papua New Guinea

Discussion

The importance of habitats

The most compelling result in this study is the overwhelming impact of habitat on reef fish community structure. Fish communities on the flat and slope are typically separated by just 20 to 30 m. They share, in all likelihood, a broadly similar supply of recruits, and are exposed to roughly the same temperatures, weather, and potentially the same predators. They lie within hearing and, likely, smelling distance of each other. Yet they are fundamentally different, in terms of both taxonomy and trait composition. These cross-habitat differences appear to be greater than the differences between fish assemblages separated by 12,000 km, in different oceans, which include species that have been separated from each other for tens of thousands, if not millions, of years (Choat et al. 2012; Cowman and Bellwood 2013). The fishes in these biogeographically separate assemblages share no common currents, food supplies or weather patterns. Yet, within a given habitat, one can find a similar suite of species, and traits, to those found in communities in the same habitat 12,000 km away.

The marked habitat differences in fish community structure tell us a great deal about coral reefs. Local physical factors are overwhelmingly important in shaping communities; by comparison, regional or biogeographic patterns may be less important (also see Alevizon et al. 1985; Bradley et al. 2022). This is most clearly shown in the dbRDA based on taxonomic composition, where habitats are separated along Axis 1 (44.1% of explained variation) while biogeographic regions fall along Axis 2 (18.3% of explained variation). Thus, despite surveying the world’s largest biodiversity gradient on coral reefs (Hughes et al. 2002; Cowman et al. 2017), local conditions overwhelmed the biogeographic signal. The regional fauna dictates what species are available; the habitat dictates what species are present on a given reef. Given the concordance seen in previous studies, it is likely that this pattern is replicated in both fishes and corals (Connolly et al. 2005) and for fishes it may also apply to other tropical marine ecosystems, including seagrass beds and mangroves (Hemingson and Bellwood 2016; Bradley et al. 2022). It is like human habitation: a kitchen will contain pots and knives, and a bedroom somewhere to sleep, no matter where you live in the world. However, this rather prosaic analogy has far-reaching implications when it comes to constructing datasets for biogeographic analyses.

Global datasets are increasingly in demand. But global surveys by a single researcher are rare. Global datasets, therefore, often rely on concatenated data (e.g. Cinner et al. 2016, 2020). But if the component surveys were undertaken in slightly different habitats, or using different methods, it may have a profound impact on subsequent data, analyses, and interpretations (cf. Dickens et al. 2011; Emslie et al. 2018; Bruneel et al. 2022). This means that in our evaluation of global patterns, local patterns must be incorporated from the outset. If we do not explicitly account for habitat, specifying depth and location to within a few metres, then the data used to construct global patterns of biogeography, trait structure, or fishing exploitation are likely to be fundamentally compromised.

Going forwards, we may need to change our approach to surveys and global-scale analyses. Our unconscious biases and failure to adequately survey key habitats (e.g. the reef flat) have stymied our understanding of reef systems (Bellwood et al. 2020). Practical limitations mean that it is often not possible to survey multiple habitats at large scales (Lutzenkirchen et al. 2024). Nevertheless, a partial picture is possible, but only if care is taken when identifying habitats and standardizing for them in analyses. Global comparisons may benefit from the identification of a standard set of habitat choices, to facilitate comparisons in a common environmental context (e.g. semi-exposed slopes at 5–10 m). By far the most useful, however, would be a standard protocol with a GPS record of the start location, habitat, depth, and direction of each transect, which would facilitate the inclusion of the necessary factors in subsequent analyses. Without such contextual data, especially habitat type and depth, existing survey data are of limited value. Ad hoc surveys without the relevant metadata are perhaps best discarded or at least excluded from concatenated global analyses. It is inappropriate to describe global biogeographic patterns in abundances, ecological characteristics, or functions unless the overriding impact of habitat variation can be demonstrably accounted for in the analyses.
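
As one possible illustration of such a protocol, the sketch below defines a minimal transect metadata record and a screening step that retains only surveys with complete contextual data. The field names, the validity ranges, and the example coordinates are all hypothetical; no existing standard is implied.

```python
from dataclasses import dataclass
from typing import Optional

HABITATS = {"flat", "crest", "slope"}  # the three habitats surveyed in this study


@dataclass
class TransectRecord:
    """Hypothetical minimum metadata for one transect."""
    start_lat: Optional[float]    # decimal degrees, GPS start position
    start_lon: Optional[float]    # decimal degrees, GPS start position
    habitat: Optional[str]        # one of HABITATS
    depth_m: Optional[float]      # depth of the transect in metres
    bearing_deg: Optional[float]  # compass direction of the swim


def has_required_metadata(rec):
    """True only if every field is present and within a plausible range."""
    values = (rec.start_lat, rec.start_lon, rec.habitat, rec.depth_m, rec.bearing_deg)
    if any(v is None for v in values):
        return False
    return (rec.habitat in HABITATS
            and -90.0 <= rec.start_lat <= 90.0
            and -180.0 <= rec.start_lon <= 180.0
            and rec.depth_m >= 0.0
            and 0.0 <= rec.bearing_deg < 360.0)


# Hypothetical records: one complete, one lacking habitat and depth.
surveys = [
    TransectRecord(-14.7, 145.4, "crest", 2.0, 310.0),
    TransectRecord(-14.7, 145.4, None, None, 310.0),
]
usable = [s for s in surveys if has_required_metadata(s)]
print(len(usable), "of", len(surveys), "surveys retained")
```

Screening on these fields before concatenation would allow habitat and depth to be included as explicit factors, rather than left as unrecorded sources of variation.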

Taxonomy versus traits

It is often reported that traits offer far more nuanced and informative insights when trying to understand changes in fish assemblages (Mouillot et al. 2013; Villéger et al. 2017; Martini et al. 2021). This assumes that traits are indicative of a fish’s ability to interact with the environment, for example in terms of food acquisition, mobility, mortality risk, or in the delivery of ecosystem functions. There is evidence to support this contention with traits revealing functional constraints (Wainwright 1988; Bellwood and Wainwright 2001; Bejarano et al. 2017; Fulton et al. 2017). However, this assumption has been questioned, with calls to re-evaluate the information content of traits and to determine the extent to which they can be named, regarded as, or used as ‘functional’ traits, i.e. traits that indicate functional abilities (reviewed in Mlambo 2014; Bellwood et al. 2019b; Streit and Bellwood 2023a). Traits may be useful indicators of function, but the extent can vary widely; it all depends on the functional information content of the traits (Streit and Bellwood 2023b).

This principle applies herein. While the crest and flat differed markedly in taxonomic space, in ‘functional’ or more accurately trait space, the separations were far less clear. This suggests that while species may have strong habitat preferences, the traits (at least based on commonly used broad categorical traits) that they possess may overlap, clouding species-specific habitat associations. This may indicate that similar functions are delivered if the traits can be shown to be functionally informative (which most are not to any great extent) (Bellwood et al. 2019b). Or it may simply indicate that the traits and inferred functional designations are just another type of crude taxonomy (cf. Mlambo 2014; Streit and Bellwood 2023a), offering little improvement on taxonomic evaluations. Indeed, most species were traditionally identified, and separated, based on morphological features (i.e. traits) (e.g. Randall 1955). These features are often at a finer scale than the traits used in traditional trait/functional analyses. Therefore, unless specifically selected, traits may indeed just be a crude form of taxonomy (Streit and Bellwood 2023a, b).

Habitat-related differences in functional traits have been described previously in labrids, mainly relating to their swimming abilities (e.g. Fulton et al. 2001). Indeed, this pattern does appear to transcend biogeographic region: reef crests are invariably dominated by Thalassoma fishes with high aspect ratio fins (Green 1996; Bellwood and Wainwright 2001; Fulton et al. 2017). In this particular example, the trait, function and ecological implications are strongly, and causally, related. For most of the traits used herein, and elsewhere, the links are weaker (reviewed in Bellwood et al. 2019b; Streit and Bellwood 2023a). Thus, while some traits, and their associated functional interpretations, may help us to understand patterns, they do not necessarily provide the details needed to significantly increase our understanding of observed patterns.

One of the issues with traits centres around the question: are traits (and trait combinations) superior to taxonomy in describing patterns? In the present example, the answer is probably no. The taxonomic evaluation provided a much stronger delineation between locations in terms of both habitats and biogeographic regions. Thus, in the identification of spatial patterns a taxonomic approach may be far more robust, as it can reveal subtle variation within groups that are unified by traits. The ‘community cluster traits’ (sensu Streit and Bellwood 2023a) applied herein are, in effect, just a crude, and potentially inferior, form of taxonomy. To describe communities, taxonomy may be a far more robust approach in some cases, and in species with distinct traits, e.g. Thalassoma spp., this may also convey important ecological and functional information (Fulton et al. 2017). If, however, the ecological characteristics of species are poorly known, or if analyses are being conducted across scales that share few, if any, species, then traits may offer a new perspective as they can provide a common language that transcends taxonomy (Steneck and Dethier 1994; Hemingson and Bellwood 2018; McLean et al. 2021; Brandl et al. 2023). Furthermore, if the traits are indeed functional traits with a demonstrable causal link to a specified function (i.e. ‘rate traits’ sensu Streit and Bellwood 2023a) then the interpretation of traits offers great promise. But for the common traits used herein, their capacity to infer functional abilities may offer little more information than that available based on a simple evaluation of the component species (Fig. S2). In the trade-off between increased functional knowledge and reduced spatial separation, the costs and benefits may need to be carefully considered.

Future considerations

As studies upscale to address larger-scale questions, reflecting the scale of environmental challenges (Hughes et al. 2018; Bellwood et al. 2019a), we may need to take care that smaller-scale processes are not lost. Coral reefs offer key insights into this phenomenon. Coral reefs are effectively spatially replicated isolated ecosystems. Each has a distinct series of specific habitats that are, like most other attributes of reefs, shaped by hydrodynamics (Lowe and Falter 2015; Aston et al. 2019). The slope-crest-flat separation is primarily driven by wave energy (Done 1983; Graus and Macintyre 1989). This shapes the distribution of almost all reef organisms from corals (Goreau 1959; Dollar 1982; Done 1983) to fishes (Fulton and Bellwood 2004, 2005; Bejarano et al. 2017). It also shapes geological processes such as sediment accumulation (Tebbett et al. 2023) and particle size distributions (Purcell 2000). Furthermore, this overwhelming physical structuring of reefs accounts for the universal similarity of reef processes: waves are universal attributes of reefs responsible for establishing the fundamental structures (Kench and Brander 2006; Hopley et al. 2007; Duce et al. 2020; Lutzenkirchen et al. 2023). Superimposed on this structure, regional physical factors (e.g. tidal regimes; Barnes et al. 2012; Harborne 2013; Lowe and Falter 2015; Bradley et al. 2022) can shape the exact nature of emergent structures and their ecology, while evolutionary processes can shape the available species pool (Floeter et al. 2008; Cowman et al. 2017; Siqueira et al. 2019). Yet none of these latter factors can override the universal impact of wind and waves in structuring fish communities across habitats, a pattern that is likely to apply to most reef organisms. Ultimately, each reef is a wave-structured ecosystem with habitat-specific taxonomic and trait assemblages that, based on the evidence herein, are replicated repeatedly across approximately half the global tropics. In an era of global datasets and high-impact interpretations, key insights may arise from the wisdom that comes from understanding the importance of waves.