Introduction

The observation that bone responds to mechanical loading by altering its structural composition describes an adaptive phenomenon commonly referred to as "Wolff’s Law" (Wolff, 1892). This biological reaction to mechanical strain is the conceptual basis for studies that use trabecular microarchitecture to infer the frequency and directionality of loading. Specifically, it maintains that trabeculae respond to habitual loading by adopting a more anisotropic microarchitecture, thereby improving stiffness efficiently by aligning with the main stress trajectories without adding mass (e.g., Galante et al., 1970; Venieratos et al., 1987). As such, trabecular microarchitecture has been used to ascertain skeletal functional adaptation, with common application to the study of mastication and diet (e.g., Kupczik et al., 2009; Ryan et al., 2010) and locomotion (e.g., Fajardo & Müller, 2001; Martinón-Torres, 2003; Rook et al., 1999; Ryan & Ketcham, 2002, 2005; Su et al., 2013; Tsegai et al., 2013).

Previous empirical studies have confirmed the functional role of certain trabecular parameters, especially the degree of anisotropy (DA) and bone volume fraction (BV/TV), due to their collective contribution to around 98% of trabecular bone’s overall elastic modulus (Maquer et al., 2015; Stauber et al., 2006). Furthermore, an assortment of in vivo experimental studies has provided convincing evidence that trabecular arrangements are sensitive enough to detect subtle changes in both load magnitude and orientation related to mechanical loading induced within controlled settings (guinea fowl: Pontzer et al., 2006; sheep: Barak et al., 2011; dog: Goldstein et al., 1991; rabbit: van der Meulen et al., 2006). Trabecular bone analyses therefore remain appealing because of their argued ability to capture subtle structural adaptations required to avoid mechanical failure under repetitive loading. Yet, some attempts to infer locomotor function from trabecular bone, even within controlled experimental settings, have proven unsuccessful (e.g., Carlson et al., 2008; Swartz et al., 1998) or have yielded inconsistent results that limit their overall interpretability (see review in Kivell, 2016).

In primates, trabecular bone characteristics observed in functionally relevant regions such as the limbs have been extensively studied; specifically, the femoral head and neck (Georgiou et al., 2018, 2020; Haeusler et al., 2020; MacLatchy & Müller, 2002; Ryan & Walker, 2010; Ryan & Ketcham, 2005), the subarticular bone of the humerus and femur (Rafferty & Ruff, 1994), and isolated aspects of the proximal femur (Saparin et al., 2011) and humerus (Scherf et al., 2013, 2016). In some instances, these efforts have demonstrated the capacity to effectively distinguish between preassigned locomotor categories and have subsequently been applied to hominin fossil material, including Australopithecus africanus (distal tibia: Barak et al., 2013b; talus: DeSilva & Devlin, 2012; metacarpals: Skinner et al., 2015; femur: Georgiou et al., 2020; hipbone: Macchiarelli et al., 1999) and Paranthropus boisei (talus: Su et al., 2013). Unfortunately, in mammalian groups outside of primates, trabecular work is limited despite the remarkable locomotor diversity and previous successful attempts to detect a functional signal related explicitly to habitual behaviors (Sciuromorpha: Mielke et al., 2018; Xenarthra: Amson et al., 2017; Odontoceti: Rolvien et al., 2017).

Trabecular bone has also proven useful in explicating informative scaling relationships in primates, and in mammals more generally, elucidating how the skeleton responds to increases in body mass across vertebrates. Previous studies on primate limbs propose that features of the trabeculae scale allometrically but are also significantly influenced by phylogeny (Ryan & Shaw, 2013). Further, a study focusing exclusively on mice, rats and humans demonstrated that trabecular number and thickness scale differently in humans versus rodents, suggesting that these particular metrics can contribute to overall bone volume differently depending on the taxonomic sample used (Barak et al., 2013a). As such, careful consideration should be given not only to locomotor behavior and joint features, but also to the implications of size and phylogeny for internal bone architecture, as there is a poor understanding of how these variables interact to influence the resultant trabecular morphology.

The relationship between body mass and trabecular distribution and/or density within the pelvis of bipedal vs. quadrupedal and arboreal vs. terrestrial mammals is also comparatively understudied. Despite previous work that focused explicitly on scaling trends within internal trabecular structure across various mammalian groups (Barak et al., 2013a; Doube et al., 2011, 2012; Fajardo et al., 2013; Mullender et al., 1996), few studies examine these features within the pelvis, notwithstanding its functional relevance. In fact, the majority of trabecular work conducted on the pelvis is concentrated primarily on human samples and usually within clinical contexts (Cunningham & Black, 2009; Dalstra & Huiskes, 1995; Dalstra et al., 1993; Wade et al., 2011). However, the few studies that use the pelvis to infer locomotor behavior within human ancestors have proven informative by revealing distinct trabecular features possibly related to our unique form of striding bipedalism (e.g., presence of a discrete ilioischial bundle, an undivided sacropubic bundle and a full diagonal crossing over the acetabulum vicinity) that can arguably be used to facilitate the recognition of bipedalism in other hominin taxa (Macchiarelli et al., 1999). Studies of human pelvic cancellous bone therefore offer a valuable comparative framework to investigate locomotion in fossil taxa and have already enabled exploration into the potential locomotor behavior of Australopithecus africanus (Macchiarelli et al., 1999), Homo neanderthalensis (Martinón-Torres, 2003) and the Miocene ape species Oreopithecus bambolii (Rook et al., 1999).

Due to the informative potential of trabecular microarchitecture, further work is needed to appreciate the functional role of trabecular bone within the pelvis, especially in comparative contexts that can utilize the impressive extent of size and locomotor diversity found across mammals. Accordingly, this study seeks to test whether similar functional signals related to locomotor behavior can be identified among extant marsupials, primates, rodents and treeshrews and whether these groups collectively, and independently, conform to any consistent scaling patterns that can shed light on important trabecular characteristics required for accurately evaluating functional adaptation. The taxa included in this study were selected based on their comparable body sizes and convergent locomotor behaviors to enable direct comparisons across different taxonomic groups in an attempt to isolate any functional signal reflected in the trabecular bone. The inclusion of species with highly derived locomotor behaviors also provides an opportunity to assess whether locomotor-based adaptation ultimately supersedes the effects of body size and phylogenetic constraints.

This study’s objectives are to (1) evaluate whether bipedal taxa differ from their quadrupedal counterparts by altering bone density to support the increased load bearing associated with two-limbed, or hindlimb-dominated, forms of locomotion, (2) determine whether trabecular orientation corresponds with commonly used locomotor modes, habitat and positional preferences across a sample of diverse taxa and body size ranges, thereby asserting its utility as a methodological approach, and (3) investigate allometric trends that can identify important size-related influences governing trabecular microarchitecture characteristics.

Materials and Methods

The study sample consists of a total of 165 individuals comprised of metatherian (n = 43) and eutherian (n = 122) mammals, including non-human and human primates (n = 91), marsupials (n = 42), rodents (n = 23) and treeshrews (n = 9) (Table 1; Online Resource 1). Only adult specimens that were free of pathological conditions and injury were included in the analyses. Specimen age was determined via visual inspection of epiphyseal fusion of associated postcranial elements. The majority of specimens were housed at the American Museum of Natural History (AMNH) Mammalogy collections while the Homo sapiens sample consisted of modern human remains housed in the Division of Anthropology at the AMNH that were obtained through medical dissection facilities during the mid-twentieth century (see Online Resource 1 for additional demographic details). The Ptilocercus lowii specimens were from the Museum of Comparative Zoology at Harvard University (MCZ), the Field Museum of Natural History (FMNH) and the Peabody Museum of Natural History at Yale University (YPM). The Ptilocercus lowii material was scanned independently at the Department of Internal Medicine at Yale University (Online Resource 1).

Table 1 Taxonomic assignations, preassigned locomotor classification modes, species means and their relative ranges for bone volume fraction (BV/TV), degree of anisotropy (DA), trabecular thickness (Tb.Th) and the calculated relative resolutions for the entire sample

Computed Tomography Imaging

The majority of specimens were scanned at the American Museum of Natural History’s Microscopy and Imaging Facility using a GE PHOENIX v|tome|x s240 micro-CT scanner, while the Ptilocercus specimens (MCZ 51736, FMNH 76855, YPM 6873) were scanned independently using a Scanco Medical µCT 35 machine at Yale University at a nominal resolution of 18.5 µm (voxel size: 0.0185 mm). The nominal resolution of the remaining scans ranges between 12.5 and 117 µm (voxel size: 0.0125–0.117 mm) owing to the physical size discrepancies between the included taxonomic groups. Prior to scanning, each specimen was mounted and secured within the µCT machine in a vertical position so that the acetabulum and the area immediately surrounding it were centered in anatomical position, thus maximizing the spatial resolution of this region of interest for each specimen. Given the importance of scan voxel size to some of the trabecular morphometric parameters, a relative resolution range was also calculated (Kothari et al., 1998; Sode et al., 2008) (Table 1). The relative resolution permits direct comparison across taxa that vary considerably in body size and scan resolution, and it is calculated by dividing the trabecular thickness by the image resolution (Sode et al., 2008). The sample of this study encompasses calculated relative resolutions between 1.6 and 9.9.
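The relative resolution described above can be sketched in a few lines of Python. This is only an illustration of the stated ratio (trabecular thickness divided by voxel size); the function name and the example thickness are hypothetical and are not taken from the study's data.

```python
def relative_resolution(tb_th_mm: float, voxel_size_mm: float) -> float:
    """Mean trabecular thickness expressed in voxels (Tb.Th / voxel size)."""
    if voxel_size_mm <= 0:
        raise ValueError("voxel size must be positive")
    return tb_th_mm / voxel_size_mm


# Hypothetical specimen: Tb.Th of 0.12 mm scanned at a 0.0185 mm voxel size.
example = relative_resolution(0.12, 0.0185)
print(round(example, 2))
```

A value near the lower end of the reported range means that an average trabecula spans only a couple of voxels, which is why thickness-dependent parameters are the ones most sensitive to scan resolution.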

VOI Preparation

A single spherical volume of interest (VOI) was digitally extracted from the superior aspect of the acetabular dome of each specimen and oriented to the same coordinate system using VG Studio Max 2.2 (Volume Graphics GmbH, Heidelberg, Germany). VOI spheres were scaled to a radius that equaled twenty percent of the maximum superoinferior (SI) acetabulum height, which was digitally measured via the polyline tool in VG Studio Max 2.1 (Fig. 1a). Scaled VOIs were used to avoid introducing an oversampling bias (see Lazenby et al., 2011). SI acetabulum height was used as an indirect proxy for body size as hip joint diameter was found to correlate with femoral head diameter, and consequently with body size, in primates (Hammond et al., 2013). Specifically, this measure strongly correlates with linear dimensions of the femoral head, which are well-established body size proxies (Jungers, 1991) effectively used in previous studies, including those based on trabecular bone scaling (e.g., Doube et al., 2011; Ryan & Shaw, 2013). The mean SI acetabulum height for 37 of the species included in this study was also regressed against mean mass estimates extracted from the literature to further affirm its suitability as a body size proxy (R² = 0.97; Online Resource 1).

Fig. 1

A virtual depiction of the various steps involved in extracting the homologous volume of interest (VOI) included in this study showing (a) the polyline tool measuring the superoinferior height of the acetabulum used to scale each VOI, (b) a lateral view of the area of interest to show the general alignment scheme and the position of fitted planes, (c) the lateral view of the same fitted planes used to standardize the VOI placement and the VOI in situ with transparency set to 60% and (d) the resultant VOI and the subsequent filtering steps used to generate the binarized stacks used in calculating the mean intercept length

The VOI used in this study was selected because it is adjacent to the hip joint and therefore within close proximity to an articular surface. Articular surfaces are believed to more reliably reflect and predict principal strains due to their active role in transferring force from the joint surface to the surrounding cortical bone (Oftadeh et al., 2015). The VOI selected for this study is one of the few regions in the pelvis conducive to capturing an adequate sample of trabeculae that also likely preserves a functional signal since the anterior margin of the superior acetabulum is thicker than most other regions, suggesting loading in both bipedal and quadrupedal taxa (e.g., Volpato et al., 2008). To ensure homologous placement of the VOIs, the isolated hipbones were oriented similarly so that the ilium and ischium aligned in the posterior view (Fig. 1b and c). A plane was subsequently fitted forming a continuous line at the most mediolateral projecting points immediately above the superior dome of the acetabulum. A plane transecting the former at the center of the longitudinal axis of the acetabulum was introduced for all specimens to establish a consistent VOI alignment and orientation relative to the world coordinate system. The VOI was digitally extracted from the center of the first plane, which typically corresponds with the widest circumference of the lower ilium, an area also found to scale allometrically with locomotor mode in primates (Lewton, 2015) (Fig. 1c). This region also permits clear visual differentiation between trabecular and cortical bone.

The VOIs were exported as three-dimensional aligned 16-bit TIFF image stacks and converted to binarized 8-bit RAW stacks using the Isodata-based iterative thresholding algorithm outlined in Ridler and Calvard (1978) (Fig. 1d). This approach calculates a composite average of the pixels that represent both the background (air space) and the object of interest (bone) to differentiate the two into binary threshold values that are incremented until a higher value than that of the original composite average is reached. It is included as a standard filtering option in the ImageJ software (ver.2.0.0-rc-67; Schneider et al., 2012) and was applied to the sample after digital segmentation of the VOIs and prior to the calculation of reported trabecular bone parameters.
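The thresholding logic described above can be illustrated with a minimal Python sketch. It implements the iterative intermeans form commonly attributed to Ridler and Calvard (1978), in which the threshold is repeatedly reset to the midpoint of the background and object means; it is not ImageJ's own code, and the function names, tolerance and grey values are hypothetical.

```python
def isodata_threshold(values, tol=0.5, max_iter=100):
    """Iterate the threshold to the midpoint of background and object means."""
    t = sum(values) / len(values)  # start from the overall mean grey value
    for _ in range(max_iter):
        background = [v for v in values if v <= t]
        foreground = [v for v in values if v > t]
        if not background or not foreground:
            break  # all values fall on one side; nothing left to separate
        new_t = (sum(background) / len(background)
                 + sum(foreground) / len(foreground)) / 2
        converged = abs(new_t - t) < tol
        t = new_t
        if converged:
            break
    return t


def binarize(values, threshold):
    """Map grey values to 8-bit binary: 255 for bone, 0 for background."""
    return [255 if v > threshold else 0 for v in values]


# Hypothetical grey values: three air voxels and three bone voxels.
greys = [10, 12, 11, 200, 205, 198]
t = isodata_threshold(greys)
print(t, binarize(greys, t))
```

In practice the same procedure would be applied to the full grey-value histogram of each 16-bit stack rather than to a short list.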

Trabecular Parameters

After the segmentation and binarization process was completed the VOIs were uploaded back into Volume Graphics VGStudio Max 2.1 where the built-in trabecular morphometrics derived from the isosurface values were calculated for each specimen. The following trabecular morphometrics were used: BV/TV (bone volume fraction, or the ratio of bone volume to total volume), BS/BV (bone surface-area-to-volume ratio), Tb.Th (average trabecular thickness), Tb.N (mean number of trabecular structures per unit length) and Tb.Sp (mean distance between trabeculae). BS/BV, Tb.Th, Tb.N and Tb.Sp were used in the assessment of allometric trends while functional interpretations were drawn primarily from BV/TV and the calculated degree of anisotropy (DA) as these measures are more biomechanically informative (Odgaard, 1997).

The degree of anisotropy (DA) was calculated via the mean-intercept length (MIL) approach. This method was performed using the BoneJ plug-in (version 1.4.2; Doube et al., 2010) available for ImageJ (Schneider et al., 2012). MIL calculates the frequency of intersections between bone and background along several randomly placed vectors of the same length throughout a three-dimensional VOI (Harrigan & Mann, 1984; Odgaard et al., 1997). The accumulated vector intercepts are then subjected to eigendecomposition, which yields the axis lengths, as eigenvalues, and information regarding orientation, as eigenvectors, from which a ratio of the minor and major axes is constructed. DA is defined mathematically in the BoneJ software as 1 − smallest eigenvalue/largest eigenvalue. Accordingly, if no favored alignment of the trabeculae is present, DA is equivalent to 0, whereas perfect alignment would result in a DA value of 1.
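As a minimal sketch of the final step only, the DA definition quoted above can be written as a short Python function. It assumes the three eigenvalues have already been obtained from the MIL fit; the function name and the example eigenvalues are hypothetical, and the MIL sampling itself is not reproduced here.

```python
def degree_of_anisotropy(eigenvalues):
    """DA = 1 - (smallest eigenvalue / largest eigenvalue)."""
    ev = [float(v) for v in eigenvalues]
    if len(ev) != 3 or min(ev) <= 0:
        raise ValueError("expected three positive eigenvalues")
    return 1.0 - min(ev) / max(ev)


# Hypothetical eigenvalues: equal values give DA = 0 (isotropic),
# while a large spread pushes DA toward 1 (strongly aligned).
print(degree_of_anisotropy([2.0, 2.0, 2.0]))
print(degree_of_anisotropy([1.0, 2.0, 4.0]))
```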

Phylogenetic Signal

In an attempt to explore any phylogenetic contribution influencing the trabecular parameters, a simulation-based phylogenetic ANOVA from Garland et al. (1993) available in the “phytools” R package (Revell, 2012) was employed on the samples with sufficient taxa. This method provides p-values for the test statistic using simulation that assumes a Brownian motion (BM) process of evolution. For the primate sample, these phylogenetically informed analyses utilized a Bayesian-derived consensus tree downloaded from 10kTrees (Arnold et al., 2010; ver. 3; http://10ktrees.fas.harvard.edu) based on mitochondrial and autosomal genes available through GenBank. In instances where data were not available for a particular species in the study sample, closely related taxa were substituted (Online Resource 2, Fig. S1). Phylogenetic generalized least squares regressions were also carried out using the aforementioned “phytools” package, along with the “ape” (Paradis et al., 2004), “geiger” (Harmon et al., 2008) and “nlme” (Pinheiro et al., 2018) packages in R, following a methodological approach comparable to that of Ryan and Shaw (2013) to permit subsequent comparison to their work on the primate femur.

For the marsupial sample, a species-level supertree previously published in the literature was used after pruning it to reflect only the species available in this study’s sample (Cardillo et al., 2004) (Online Resource 2, Fig. S1). Once the tree was created for both the primate and marsupial samples, alternative trees were converted into ultrametric trees via “phytools” and assigned branch lengths derived from the rho transformation that was implemented using the “ape” package in R (Grafen, 1989). The “phytools” package was then used to perform the phylogenetic ANOVAs and subsequent post-hoc sequential Bonferroni corrections on the trabecular parameter values for the sample. Phylogenetic MANOVAs were conducted using the aforementioned “geiger” package. Phylogenetic signal in each variable was independently assessed using Blomberg’s K statistic (Blomberg et al., 2003), also available in the “phytools” package. Blomberg’s K measures the phylogenetic signal of biological traits as a scaled ratio of the variance among species over the contrasts variance; specifically, a value of 1.0 is anticipated for a trait evolving under a BM pattern of evolution, while values closer to 0 denote the absence of a detectable phylogenetic influence.

Phylogenetically informed linear regressions (pGLS) and standard OLS regressions were compared using species means in both marsupials and primates to account for interspecific autocorrelation due to phylogenetic influence. Each trabecular variable was log-transformed and subsequently regressed against log-transformed SI acetabulum height, the latter serving as a proxy for body size. The pGLS regressions utilized the maximum likelihood estimation of Pagel’s λ (lambda) (Pagel, 1999), applied via the “caper” R package (Orme et al., 2013), and were compared to traditional OLS regressions. The latter, which do not consider phylogenetic tree structure, assume independence among species. In contrast, pGLS provides a lambda value, which is a branch length scaling parameter usually between 0 and 1 (though it can be slightly higher if phylogenetic signal is present and deviates from a BM pattern), and this value corresponds directly to the internal branch lengths of the tree. Accordingly, values closer to 0 indicate less phylogenetic structure, approximating a “star phylogeny” once at 0, as the internal branch lengths are smaller, indicating a lower phylogenetic signal. Values equal to 1 imply a BM pattern in which resemblances in the trait data reflect phylogenetic constraints such that closely related species appear more similar. The predictive accuracy, or model fit, was assessed using the Akaike information criterion (AIC), with the lowest AIC value indicating the best model fit. The rodent and treeshrew samples did not meet the methodological requirements, due to insufficient sample sizes, and were consequently excluded from any phylogenetic analyses. Further, any taxa without a definitive species allocation were only included in the allometric analyses since locomotor mode could not be reliably inferred.
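The role of λ as a branch length scaling parameter, and the AIC comparison, can be sketched in Python under simplifying assumptions. The snippet below scales the off-diagonal (shared-history) entries of a phylogenetic variance-covariance matrix by λ and computes AIC from a log-likelihood; it does not estimate λ by maximum likelihood as the “caper” package does, and the function names, matrix and log-likelihood values are hypothetical.

```python
def lambda_transform(vcv, lam):
    """Scale shared (off-diagonal) covariances by lambda; keep tip variances."""
    n = len(vcv)
    return [[vcv[i][j] if i == j else lam * vcv[i][j] for j in range(n)]
            for i in range(n)]


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate better model fit."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical three-species matrix: species 1 and 2 share 60% of their history.
vcv = [[1.0, 0.6, 0.0],
       [0.6, 1.0, 0.0],
       [0.0, 0.0, 1.0]]

star = lambda_transform(vcv, 0.0)      # lambda = 0: star phylogeny (OLS-like)
brownian = lambda_transform(vcv, 1.0)  # lambda = 1: unchanged, BM expectation

# Hypothetical log-likelihoods for an OLS fit and a pGLS fit of the same data.
print(aic(-12.0, 2), aic(-9.5, 3))
```

With λ = 0 the matrix reduces to independent species, which is why an OLS regression can be read as the λ = 0 special case when the two models are compared by AIC.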

Allometry

To evaluate general allometric trends across the entire sample, a series of reduced major axis (RMA) regressions were performed using the “smatr” R package (Warton et al., 2012). The use of RMA regressions is preferable as they are robust against bias introduced by arbitrary scale differences among variables. Each log-transformed trabecular morphometric was then regressed against SI acetabulum height to explore potential scaling relationships. Specifically, Tb.Th and Tb.Sp are linear measures so both would produce a slope of 1 in the case of isometry, with higher and lower slope values suggesting positive and negative allometry, respectively. BV/TV, DA and Tb.N are technically shape variables, thus if they are isometric, they produce a slope of 0 with higher values signifying positive allometry and lower ones implying negative allometry. BS/BV, a ratio of quantitative variables with different units, i.e., bone surface (mm²) and bone volume (mm³), results in a new size variable with the unit mm⁻¹. Therefore, BS/BV would have a slope of −1 under isometry, with values above indicating instances of positive allometry and those below it, negative (e.g., Mielke et al., 2018; Ryan & Shaw, 2013).
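For readers unfamiliar with RMA, the slope estimate can be sketched in Python as the ratio of the standard deviations of the two log-transformed variables, signed by their covariance. This is a bare illustration rather than the “smatr” implementation (which also provides confidence intervals and slope tests); the function name and the measurements are hypothetical, and both variables are log10-transformed here.

```python
import math


def rma_regression(x, y):
    """Reduced major axis fit on log10-transformed data: (slope, intercept)."""
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    syy = sum((b - my) ** 2 for b in ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    # RMA slope: ratio of standard deviations, carrying the sign of covariance.
    slope = math.copysign(math.sqrt(syy / sxx), sxy)
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical data: a linear measure that doubles when acetabulum height
# doubles should return a slope of about 1 (isometry for Tb.Th or Tb.Sp).
heights_mm = [5.0, 10.0, 20.0, 40.0]
tb_th_mm = [0.05, 0.10, 0.20, 0.40]
slope, intercept = rma_regression(heights_mm, tb_th_mm)
print(round(slope, 3), round(intercept, 3))
```

The fitted slope would then be compared with the isometric expectation for the variable in question (1 for linear measures, −1 for BS/BV).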

Results

BV/TV

In terms of bone volume density, as measured by BV/TV, statistically supported differences exist across the entire sample for the preassigned categories, though only for select locomotor modes (One-way ANOVA, p < 0.001; Table 2). In fact, subsequent Tukey HSD (Honest Significant Difference) tests and pairwise t-tests between all groups confirm that only some locomotor modes are distinguishable when adjusted p-values and confidence intervals are considered (Table 2; Online Resource 2, Fig. S2). For the complete sample, there are statistically supported mean differences recovered via pairwise t-tests with applied Holm-Bonferroni correction for arboreal quadrupeds (AQ) versus knuckle-walkers/terrestrial quadrupeds (KW/TQ), saltators and arboreal quadrupeds that also utilize climbing behaviors (AQ, climbers). This is also observed for vertical clingers and leapers (VCL) versus arboreal quadrupeds that climb (AQ, climbers) and saltators (Table 2). There were statistically supported differences across broad classifications like arboreal vs. terrestrial (t-test, t (77.2) = −2.79; p = 0.007), though not specifically for quadrupedal vs. bipedal taxa (t-test, t (125) = 1.08; p = 0.28) (Fig. 2). In fact, differences between biped and quadruped BV/TV values are statistically significant only for the rodent sample (t-test, t (17.7) = 4.10, p < 0.001) (Fig. 3). Pronograde versus orthograde postural preference was not significant (One-way ANOVA, p = 0.09).
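The Holm-Bonferroni adjustment applied to the pairwise comparisons can be sketched in Python as follows. This is a generic illustration of the sequential procedure, not the authors' script; the function name and the raw p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Holm-Bonferroni adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from three pairwise locomotor-mode comparisons.
print(holm_adjust([0.01, 0.04, 0.03]))
```

Because the multiplier shrinks as the ranked p-values grow, the procedure is less conservative than a plain Bonferroni correction while still controlling the family-wise error rate.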

Table 2 Pairwise t-test results for BV/TV values for different locomotor modes for the complete sample
Fig. 2

BV/TV plotted for assigned locomotor groups across the entire sample and by individual groups. AQ arboreal quadrupeds, AQ, C arboreal quadrupeds with climbing capabilities, B brachiators, Sa saltators, SB striding bipeds, Su suspensory, TQ terrestrial quadrupeds, KW/TQ knuckle-walkers and terrestrial quadrupeds, VCL vertical clingers and leapers. Postural preference is depicted using grey to represent species with orthograde trunk orientation and orange to depict species with a pronograde trunk alignment. Depicted box plots represent the interquartile range accounting for 50% of the data with box edges signifying the 25th percentile to the 75th percentile with the median pictured as a horizontal line residing within the box. The whiskers represent all the values within 1.5 times the interquartile range, unless the data does not exceed that range in which case the maximum and minimum values for the sample are used

Fig. 3

BV/TV plotted for quadrupeds vs. bipeds for marsupials, primates and rodents with 3D VOI renderings of trabecular bone digitally extracted from representatives for each respective locomotor category. Depicted box plots represent the interquartile range accounting for 50% of the data with box edges signifying the 25th percentile to the 75th percentile with the median pictured as a horizontal line residing within the box. The whiskers represent all the values within 1.5 times the interquartile range, unless the data do not exceed that range, in which case the maximum and minimum values for the sample are used. Note the overall higher and similar BV/TV value ranges for the bipedal saltators in rodents and marsupials despite the large discrepancy in their body sizes. Only the rodent sample had statistically supported differences between quadrupeds and bipeds (p = 0.00069). Humans are the only bipeds included in the primate sample and have slightly lower values compared to bipeds in other groups, though bipeds in both the marsupial and primate groups show lower BV/TV values compared to their quadrupedal counterparts

Locomotor categories correlated with BV/TV mean values within the primate sample even after phylogenetic correction (phylogenetic MANOVA F = 0.023, p = 0.039; Wilks’ statistic, p = 0.023). Interestingly, the primate biped category, represented solely by humans, reveals lower BV/TV values when compared to primate quadrupeds, contradicting the trend observed in rodents where quadrupeds tended to have the anticipated lower values relative to their bipedal counterparts (Fig. 3). These observations are counterintuitive as BV/TV would be expected to be higher in bipeds than in quadrupedal equivalents since bipeds load their joints with more relative force per unit of body mass. Across all locomotor groups the most variance is observed for terrestrial quadrupeds, while the VCL, arboreal quadruped and brachiator categories have the lowest density ranges across the sample when compared to the terrestrial taxa (Fig. 2). Rafferty and Ruff (1994) also reported lower density values in the femoral head for brachiators (i.e., Hylobates), which they argued possibly reflected their limited usage of the hindlimbs during locomotion. Primates show differences in BV/TV between arboreal and terrestrial groups (t-test, t (37.5) = −4.69, p < 0.001; phylogenetic ANOVA, F = 19.64, p < 0.001). This suggests that primates may retain a functional signal at least partly attributed to their exploitation of arboreal vs. terrestrial environments, though this is difficult to disentangle as terrestrial taxa are generally larger in body size. Positional preference, i.e., orthograde vs. pronograde trunk orientation, is not significant in primates, with or without phylogenetic correction (t-test, t (49.5) = 0.537, p = 0.59; phylogenetic ANOVA, F = 0.068, p = 0.786).

Marsupials do not show statistically different results for BV/TV across locomotor categories (One-way ANOVA, p = 0.42). Furthermore, there is no statistical evidence for differences between quadrupeds and bipeds (t-test, t (22.2) = −0.991, p = 0.33; phylogenetic ANOVA, F = 2.02, p = 0.53), for pronograde vs. orthograde taxa (t-test, t (11.6) = 1.09, p = 0.29; phylogenetic ANOVA, F = 0.029, p = 0.911) or for arboreal versus terrestrial species (t-test, t (11.6) = 1.09, p = 0.29; phylogenetic ANOVA, F = 0.15, p = 0.74). The highest overall values for BV/TV in the sample are attributed to Dendrolagus (0.40), the tree kangaroo, a marsupial that engages in a combination of saltatory locomotion, vertical climbing and even impressive leaping behaviors (Procter-Gray & Ganslosser, 1986).

Anisotropy

When examining the degree of anisotropy via mean intercept length values, the entire sample does not comply with any obvious or consistent locomotor-based trends (Fig. 4) despite a statistically significant ANOVA result maintained after an applied correction (One-way ANOVA, p < 0.001) and t-tests that reveal differences across certain included categories (Table 3). Several of the mean comparisons across locomotor modes are not supported as zero is encompassed within their 95% confidence intervals demonstrated via a Tukey HSD test (Online Resource 2, Fig. S2). When the sample is considered collectively, differences are also not statistically significant for bipeds versus quadrupeds (t-test, t (113) = 0.602, p = 0.59).

Fig. 4

Degree of anisotropy plotted for pre-assigned locomotor groups across the entire sample and by individual groups. AQ arboreal quadrupeds, AQ, C arboreal quadrupeds with climbing capabilities, B brachiators, Sa saltators, SB striding bipeds, Su suspensory, TQ terrestrial quadrupeds, KW/TQ knuckle-walkers and terrestrial quadrupeds, VCL vertical clingers and leapers. Posture preference is depicted using grey to represent species with orthograde trunk orientation and orange to depict species with a pronograde trunk alignment. Depicted box plots represent the interquartile range (IQR) accounting for 50% of the data with box edges signifying the 25th percentile to the 75th percentile with the median pictured as a horizontal line residing within the box. The whiskers represent all the values within 1.5 times the IQR, unless the data do not exceed that range in which case the maximum and minimum values for the sample are used. Dots represent outliers that surpass the 1.5 × IQR threshold

Table 3 Pairwise t-test for DA values across different locomotor modes for the complete sample

However, there are statistically supported group differences in quadrupeds versus bipeds for DA in primates (t-test, t (17.5) = −2.83, p = 0.011) and rodents (t-test, t (12.2) = 2.95, p = 0.012) (Fig. 5). Postural preference (orthograde vs. pronograde) also appears correlated with the degree of trabecular orientation, as mean differences between these taxa across the sample are preserved regardless of their use of arboreal vs. terrestrial habitats (One-way ANOVA; p = 0.006) (Fig. 6). Habitat itself, however, is not a statistically supported variable when the entire sample is analyzed (t-test, t (107) = 1.33, p = 0.19). DA is generally higher for many of the pronograde taxa relative to orthograde arboreal species, a finding consistent with claims made elsewhere in the literature (Ryan & Walker, 2010).
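The fractional degrees of freedom reported for these t-tests are consistent with an unequal-variance (Welch) test, although the text does not name the variant, so that reading is an assumption. Under that assumption, the statistic and its Welch-Satterthwaite degrees of freedom can be sketched in Python as follows; the function name and the DA values are hypothetical.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical DA values for a quadrupedal and a bipedal group.
quadrupeds = [0.55, 0.60, 0.58, 0.62, 0.57]
bipeds = [0.40, 0.52, 0.35, 0.47, 0.44]
t_stat, dof = welch_t(quadrupeds, bipeds)
print(round(t_stat, 2), round(dof, 1))
```

The degrees of freedom fall below n₁ + n₂ − 2 whenever the group variances or sample sizes are unequal, which would explain non-integer values such as those reported here.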

Fig. 5

Degree of anisotropy for quadrupeds versus bipeds for marsupial, primate and rodent samples with trabecular VOI renderings of representatives for each respective locomotor category. Depicted box plots represent the interquartile range (IQR) accounting for 50% of the data with box edges signifying the 25th percentile to the 75th percentile with the median pictured as a horizontal line residing within the box. The whiskers represent all the values within 1.5 times the IQR, unless the data do not exceed that range, in which case the maximum and minimum values for the sample are used. The adjusted p-values from t-tests with subsequent Holm–Bonferroni corrections are also displayed, showing discernible differences between bipedal and quadrupedal species only in the primate and rodent samples

Fig. 6

Degree of anisotropy by postural preferences across the entire sample with the associated p-value from a t-test showing detectable differences in means for individuals utilizing orthograde (grey) versus pronograde (orange) postures. Depicted box plots represent the interquartile range (IQR) accounting for 50% of the data with box edges signifying the 25th percentile to the 75th percentile with the median pictured as a horizontal line residing within the box. The whiskers represent all the values within 1.5 times the IQR, unless the data do not exceed that range in which case the maximum and minimum values for the sample are used

Regarding primates specifically, mean values for the knuckle-walkers and the vertical clingers and leapers are significantly more isotropic, while striding bipeds (humans) are the most isotropic in the sample, a trend that directly opposes the assumption that their repetitive loading of the hip joint should theoretically produce more anisotropic trabeculae (Fig. 4). Locomotor modes are statistically supported in primates (One-way ANOVA; F = 6.702; p < 0.001), though post-hoc Tukey tests again reveal several overlapping locomotor categories (Online Resource 2, Fig. S3), and a phylogenetically corrected ANOVA failed to produce significant results (F = 2.20; p = 0.31). When independently assessed via regression models, which function as pooled two-independent-sample t-tests when conducted with categorical variables, the preassigned locomotor modes offer little explained variance (R2 = 0.40, p < 0.001). Postural differences do produce statistical support (t-test, t (42.3) = −4.74, p < 0.0001), though they contribute only a small degree of explained variance (R2 = 0.29, p < 0.001). This is also the case for arboreal vs. terrestrial habitat preference (R2 = 0.16, p = 0.002). Specifically, of the included groups, arboreal quadrupeds are statistically distinct from knuckle-walkers and striding bipeds, while VCL and striding bipeds are clearly differentiated from a variety of modes (i.e., brachiators, suspensory taxa, terrestrial quadrupeds and VCL). However, DA mean value comparisons between arboreal and terrestrial primate taxa are also not significant after phylogenetic correction (phylogenetic ANOVA; F = 3.02; p = 0.28). The comparable DA values observed for Homo and the African apes, Pan and Gorilla, strongly suggest the possibility of phylogenetic influences governing DA values (Fig. 7).

Fig. 7

DA trait values representing the species means of the sample mapped onto a phylogeny, depicted here at genus level, constructed for (a) the marsupials included in the sample and (b) the primates. Trees are based on a previously published pruned version of a supertree derived from morphological and molecular data and on a molecular consensus tree, respectively

Marsupials show no statistically supported values for similar regression analyses using preassigned locomotor modes (One-way ANOVA, p = 0.24). Within the included marsupial sample, there are also no discernible mean differences in DA between terrestrial vs. arboreal taxa (t-test, t (21.4) =  − 1.59, p = 0.13; phylogenetic ANOVA F = 0.46; p = 0.65) nor are there significant differences between pronograde versus orthograde taxa (t-test, t (21.4) =  − 1.59, p = 0.12; phylogenetic ANOVA F = 0.86; p = 0.54). Treeshrew and rodent samples were again too limited to investigate all of the categorical variables with statistical accuracy.

Scaling

Phylogenetic generalized least-squares (pGLS) regressions were implemented for the primate and marsupial samples, but only the former evidenced a detectable size-driven phylogenetic influence on all trabecular regression residuals to the exclusion of BV/TV and DA (Online Resource 3). The phylogenetic models outperformed the traditional OLS regression in all instances, suggesting that Tb.Th, BS/BV, Tb.N and Tb.Sp regression residuals are influenced by phylogeny (Online Resource 3; Online Resource 2, Fig. S4). Marsupials, on the other hand, do not show significant size-driven trends in either the standard OLS or the pGLS analyses (Online Resource 2), though they had smaller sample sizes and thus limited taxonomic sampling. Phylogenetic signal was also assessed by calculating Blomberg’s K-statistic for each variable in both the marsupial and primate groups (Fig. 8), with only body size and Tb.N in the primate sample adhering to a Brownian motion (BM) mode of evolution that was statistically supported and confirmed via successive random permutation testing. The significant phylogenetic constraint on body size is typical given that an organism’s size evolves gradually over time compared to more labile behavioral traits, which can change relatively quickly (Freckleton, 2002).

Fig. 8

Blomberg’s K-statistic calculated for marsupial and primate samples using all trabecular parameters. K < 1 indicates that closely related species resemble each other less than expected under the Brownian motion model of trait evolution. K > 1 means that closely related species are more similar than predicted by the model. Accordingly, greater values of K suggest a stronger phylogenetic signal. BV/TV bone volume fraction, BS/BV bone surface-area-to-volume ratio, Tb.N trabecular number, Tb.Sp trabecular spacing, Tb.Th trabecular thickness, DA degree of anisotropy

In general, the sample reveals that several of the trabecular parameters have a significant size-driven component, with Tb.Th (R2 = 0.74) and BS/BV (R2 = 0.74) being the most influenced by size and, to a lesser extent, Tb.N (R2 = 0.48) (Fig. 9), as determined by comparing log-transformed trabecular variables against log-transformed SI acetabulum height across the entire sample using RMA regressions. Specifically, BV/TV, Tb.Th and Tb.Sp correlate positively with body size, while Tb.N and BS/BV show an inverse relationship (Fig. 9). There is evidence of a weak relationship between body size and BV/TV across the entire sample (R2 = 0.11; p < 0.001) (Fig. 9), though separate analyses show stronger correlations among rodents (R2 = 0.24; p = 0.015) and primates (R2 = 0.24; p < 0.001). In contrast, this size trend is not statistically supported in marsupials (R2 = 0.05; p = 0.15) or treeshrews (R2 = 0.30; p = 0.12).

Fig. 9

Log–log RMA regressions for each trabecular metric vs. superoinferior (SI) acetabulum height for the entire sample. BV/TV bone volume fraction, BS/BV bone surface to volume ratio, Tb.Th trabecular thickness, Tb.Sp trabecular spacing, Tb.N mean trabecular number per unit length, DA degree of anisotropy. Note the high correlation between some trabecular parameters and SI acetabulum height (i.e., BS/BV, Tb.Th and Tb.N)
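For readers unfamiliar with RMA (reduced major axis) regression, the slope is the ratio of the standard deviations of the two log-transformed variables, signed by their correlation, rather than the least-squares slope. The sketch below assumes base-10 logarithms and uses a hypothetical function name and made-up measurements; it is not the authors' implementation.

```python
import math
import statistics

def rma_log_log(size, trait):
    """Log-log reduced major axis fit; returns (slope, intercept, r_squared)."""
    log_x = [math.log10(v) for v in size]
    log_y = [math.log10(v) for v in trait]
    mean_x, mean_y = statistics.mean(log_x), statistics.mean(log_y)
    sxx = sum((v - mean_x) ** 2 for v in log_x)
    syy = sum((v - mean_y) ** 2 for v in log_y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    r = sxy / math.sqrt(sxx * syy)
    # RMA slope: sd(y) / sd(x), carrying the sign of the correlation.
    slope = math.copysign(math.sqrt(syy / sxx), r)
    intercept = mean_y - slope * mean_x
    return slope, intercept, r * r

# Hypothetical acetabulum heights (mm) and a trait following an exact power law.
heights = [1.0, 10.0, 100.0, 1000.0]
trait_values = [2.0 * h ** 0.5 for h in heights]
slope, intercept, r_squared = rma_log_log(heights, trait_values)
```

Because the RMA slope equals the least-squares slope divided by the absolute correlation, the two estimates diverge most for the weakly correlated metrics, which matters when slopes are compared with an isometric expectation.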

In terms of allometric scaling relationships, the results of this study suggest that BV/TV and BS/BV scale with positive allometry while DA, Tb.N, Tb.Th and Tb.Sp scale with negative allometry (Table 4). The positive allometry noted for BV/TV corroborates a previous study based on the femur and the humerus of various primate species (Ryan & Shaw, 2013). Doube et al. (2011), who offered a comprehensive allometric study of mammalian and avian femoral trabeculae (90 taxa with body masses ranging from 3 g to 3400 kg), found only a weak positive scaling relationship between body size and BV/TV, confined explicitly to the avian taxa in their sample. Furthermore, Doube and colleagues also reported positive allometric relationships for both Tb.Th and Tb.Sp, concluding that larger animals have relatively thicker trabeculae that are spaced farther apart than those of smaller taxa. However, the results of this study coincide with those of Ryan and Shaw (2013), who found evidence of strong negative allometry for these parameters and proposed that although the larger mammals in the sample may have absolutely thicker trabeculae, as inferred from the positive relationship depicted by the slope in Fig. 9, relative to their body size the larger mammals tended to have thinner and more tightly compressed trabeculae than predicted under isometric scaling, or geometric similarity. This study corroborates the additional finding by Ryan and Shaw (2013) that larger mammals in the sample have slightly more bone per unit volume than the smaller individuals as well as fewer individual trabeculae. BS/BV, albeit scaling inversely with body size in absolute terms (i.e., there is less bone surface area relative to volume as body size increases), evidenced slight positive allometry, indicating that this ratio is marginally higher in larger mammals than anticipated under an isometric scaling relationship (Ryan & Shaw, 2013). Barak et al. (2013a), using a sample confined to humans, mice and rats, also found a negative allometric relationship between body mass and Tb.N, along with evidence for negative allometry for Tb.Sp and Tb.Th, which again parallels the findings generated by this study. The results presented here also support the Barak et al. (2013a) assertion that rodents are generally “over built” for their small size and therefore make poor comparative referents. In general, this work supports interpretations that smaller animals have fewer, relatively thicker and more widely spaced trabeculae when compared to larger taxa. This interpretation is also consistent with theoretical modeling of trabecular bone that points to a metabolic mechanism as the cause of this negative allometry (Christen et al., 2015). The non-linear, inverse relationships observed among these trabecular variables throughout this study’s sample reiterate the complex relationship that they likely share (Online Resource 2, Fig. S5–S8). In fact, Saparin et al. (2011) demonstrated this complexity within primates, showing that areas of the femur subjected to higher loads showed increased BV/TV that was caused by trabecular thickening and not alterations to trabecular number, while less frequently loaded regions evidenced lower BV/TV and lower trabecular number though trabecular thickness remained constant.
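The distinction drawn above between the sign of a slope and the direction of allometry can be made explicit by comparing each slope's confidence interval with its isometric expectation. The sketch below rests on assumptions that the text does not state: against a linear size proxy such as acetabulum height, a length like Tb.Th is taken to have an isometric slope of 1, an inverse length like BS/BV a slope of −1, and a dimensionless ratio like BV/TV a slope of 0. The function name and the intervals are hypothetical.

```python
def classify_allometry(slope_ci, isometric_slope):
    """Compare a slope's confidence interval with its isometric expectation."""
    low, high = slope_ci
    if low > isometric_slope:
        return "positive allometry"
    if high < isometric_slope:
        return "negative allometry"
    # The interval contains the isometric slope, so isometry is not rejected.
    return "isometry not rejected"

# Hypothetical 95% confidence intervals, not values from Table 4.
thickness = classify_allometry((0.55, 0.75), isometric_slope=1.0)        # length
surface_ratio = classify_allometry((-0.90, -0.70), isometric_slope=-1.0)  # inverse length
volume_fraction = classify_allometry((-0.10, 0.20), isometric_slope=0.0)  # dimensionless
```

Under these assumptions a positive slope below 1 reads as negative allometry, and a negative slope shallower than −1 reads as positive allometry, which is the pattern the paragraph describes for Tb.Th and BS/BV respectively.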

Table 4 Log–log RMA regression results for allometric investigations of the individual trabecular parameters using SI acetabulum height as a body size proxy for the entire sample

When primate trabecular metrics are examined separately, BS/BV and BV/TV continue to scale with positive allometry (R2 = 0.74; p < 0.0001; Online Resource 2, Table S1), while the remaining parameters evidence the aforementioned inverse relationships, to the exclusion of DA, which is not significant (Online Resource 2, Table S1). Previous work found DA to be independent of body size in both primates (Fajardo et al., 2013) and in a large comparative mammalian sample encompassing a wide range of body sizes (Barak et al., 2013a; Doube et al., 2011). The current sample supports these previous claims of size independence for DA, a result acknowledged elsewhere for other skeletal elements and now extended to the hip via this analysis (e.g., Barak et al., 2013a).

Discussion

Although there is a weak functional signal detected in the microarchitecture of the hip near the superior aspect of the acetabulum, this remains true only for certain locomotor modes and its expression differs depending on the taxonomic group under consideration. As such, reliably distinguishing between locomotor modes and between broad locomotor categories (i.e., arboreal vs. terrestrial and bipedal vs. quadrupedal) across the sample is difficult due to the extensive overlap between the different modes and the noted influence of body size on certain trabecular parameters, especially within the primate sample. Accordingly, only highly specialized locomotor modes seemingly offer robust differences, a conclusion independently reached by Fajardo et al. (2007) using a primate sample to evaluate trabeculae in the femur.

Marsupials tend to have higher DA values relative to the primate sample, which may be due to the inclusion of more terrestrial taxa in the former, though there are additional functional explanations that appear independent of body size. This could also be evidence of a larger phylogenetic or metabolic influence that may be driving some of the unexpected variability. Within primates, the terrestrial, pronograde species (i.e., Cercopithecus and Papio) are more anisotropic relative to the other taxa, while in marsupials the higher values are associated with the terrestrial Vombatus and next with the saltatory species Thylogale and Petrogale. Thylogale is known to use quadrupedal locomotion more frequently than other macropods, and Petrogale is unique in that it resides in highly unpredictable, rocky terrain that warrants specialized morphology required for scrambling, bounding and leaping behaviors not typically used by other macropods (Barker, 1990; Horsup, 1994). Marsupials also lack statistical support for differences in BV/TV and DA across the locomotor mode, posture and habitat variables. In contrast, primates do show a correlation between BV/TV and habitat, and their DA values seem to partially correlate with positional behavior. However, after phylogenetic corrections were applied, many of the differences between locomotor groups became non-significant, suggesting that these may be evidence of phylogenetic inertia rather than functionally relevant features. Since substantial body size increases and the acquisition of orthogrady characterize the African apes relative to the rest of the primate sample, it remains a challenge to isolate allometric and phylogenetic effects from functional skeletal adaptation.

Interestingly, Homo, Pan and Gorilla are considerably less anisotropic than any other species in the sample, thus suggesting either a phylogenetic or a size-driven trend that is partially verified via the pGLS analyses (Online Resource 3). However, the phylogenetic contribution to DA appears to be minimal given the low value obtained from Blomberg’s K-statistic, which does not reach statistical significance with additional permutation testing (K = 0.41, p = 0.31; Fig. 8). This could simply be because the pattern of evolution is not captured by a Brownian motion model and would benefit from investigation with more advanced comparative modeling methods. It is also plausible that their varied locomotor repertoires require different loading patterns throughout the acetabulum region that favor this particular configuration. Yet, given that DA is believed to reflect the directionality of habitual loading, humans should theoretically be highly anisotropic. Instead, the opposite trend is observed in this analysis. Former work by Ryan and Shaw (2012) found Pan to exhibit dense and isotropic trabecular patterns in the femur relative to terrestrial species like Macaca and Papio, which are generally more anisotropic, with the latter finding being consistent with the results presented in this analysis. Furthermore, suspensory taxa such as Symphalangus and Pongo had previously been reported as having more isotropy in the center of the femoral head (Ryan & Shaw, 2012), which is inconsistent with the results presented here, as the Pongo specimens included in this sample tended to exhibit DA values that are more comparable to the terrestrial species included in this study.

Previous studies that examined modern humans relative to other hominoids also found them to be considerably more anisotropic in terms of femoral trabeculae (Ryan & Shaw, 2015), again contrasting with the isotropic values obtained for humans in this sample. Perhaps this could evidence a compensatory relationship between the femur and pelvis, in which the femur directs the principal stresses in an anisotropic manner while the hip counters this by dissipating force through a more isotropic distribution pattern throughout the acetabulum itself. Such hypotheses should be explored further to refine functional interpretations for entire joint complexes like the hip. Further, restrictions on trabecular thickness imposed by osteocyte functional limitations or other metabolic constraints, like calcium homeostasis, affecting trabecular bone in larger mammals could explain some of the findings of this study, including the negative allometry noted for Tb.Th, Tb.Sp and Tb.N (e.g., Christen et al., 2015; Mullender et al., 1996; Swartz et al., 1998).

Certain human populations also evidence substantial differences in BV/TV related to increased sedentary lifestyles (Chirchir et al., 2015, 2017) that could explain unusual patterning observed in other variables that are attributed to metabolic processes, or age-related changes, or that may also reflect specific loading differences unique to the human sample used in this study. The individuals in the human sample for this study had an age range between 36 and 75 years, and although the youngest individual had the highest BV/TV value, the oldest did not have values significantly lower than the others, making it unlikely that age alone could explain the lower density values. Former studies of the human tibia revealed it to be highly anisotropic despite having lower BV/TV values compared with the femur, showing perhaps a form of compensation in which bone strength is obtained via reorienting trabeculae rather than increasing bone volume in elements that benefit functionally from remaining lighter (Saers et al., 2016). However, similarities of the general orthogonal arrangement of the trabeculae between chimpanzees and humans were also noted in previous work on the corpus of the ilium, with the authors suggesting that, despite their vastly different locomotor and postural behaviors, the loading environment itself may still be quite similar (Martinón-Torres, 2003). Furthermore, studies have found this to be the case in related structures like the vertebral column, thereby adding some credibility to this explanation (Brown et al., 2002). There is also evidence of potential systemic patterning within species that may explain some of the variability in these parameters for certain taxonomic groups despite their locomotor preferences, a possibility that should be explored via further comparative investigations (Tsegai et al., 2018).

Within the marsupial sample, Phascolarctos, Setonix and Macropus are among the more isotropic species in the sample, with the first being a highly arboreal species engaging in vertical climbing, the second also being documented as utilizing climbing behaviors, and with Macropus being a large-bodied saltator. This makes it difficult to determine whether the lack of trabecular anisotropy is a result of varied loading behavior, larger overall body size and/or the fact that Macropus is highly specialized relative to other taxa given the soft tissue accommodations that maximize elastic energy storage during saltatory locomotion. Unfortunately, investigations into trabecular bone adaptation in marsupials (though see Biewener et al., 1996) are sparse compared to those on placental mammals, consequently limiting the ability to compare this work with previous studies on other skeletal elements.

Conclusions

The analyses conducted in this study have effectively (1) identified which locomotor modes are statistically discernible using BV/TV and DA values across the different mammalian groups, (2) extended the known allometric scaling trends observed in other elements to include the hipbone and (3) demonstrated the presence of a detectable phylogenetic signal governing the expression of certain trabecular bone parameters within primates. In addition, contrary to the study’s initial hypothesis, not all hindlimb-dominated forms of locomotion generated higher BV/TV values as anticipated; in fact, striding bipeds and the vertical clingers and leapers have significantly lower values than many of the other taxa included in this study despite their extensive use of the hindlimbs while moving. This finding could also be an artifact of the limitations imposed by using a simplified locomotor classification system and would thereby benefit from further study integrating more detailed behavioral data. For example, the ability of terrestrial mammals to avoid high forces by altering gait speed, compliance and posture remains a tenable explanation for the lack of a consistent locomotor signal in hipbone trabeculae across this study’s diverse sample (e.g., Biewener, 1989).

DA values captured some general locomotor-related trends and distinguished the use of orthograde versus pronograde trunk positions in primates. DA mean values were generally lower for the suspensory taxa than for quadrupeds included in the sample, though they are not consistently lower than all of the terrestrial taxa. As such, both BV/TV and DA values are only marginally informative for understanding the frequency and overall functional contexts for loads encountered at the hip joint, because although they detect differences, these differences cannot be clearly attributed to a discrete functional role. Accordingly, careful consideration of motion associated with joint excursion and the interaction between trabecular microarchitecture and the gross morphology of nearby articular surfaces should be the focus of future work.

The allometric component of this study elucidates several scaling relationships between the trabecular parameters and body size. Specifically, BV/TV and BS/BV scale with positive allometry while DA, Tb.Th, Tb.N and Tb.Sp scale with negative allometry, a finding consistent with previous work done on the femur using mammalian samples (Barak et al., 2013a; Mullender et al., 1996; Ryan & Shaw, 2013). This sample showed BV/TV was more strongly correlated with size in the smaller taxa (rodents and treeshrews), though phylogenetic effects and inadequate sample sizes for these groups may explain some of the observed variance. Despite their small body sizes, these groups do not necessarily resemble one another in their specific scaling trends, making it more plausible that taxon-specific influences may also obscure any functional signal when it is assessed on a broader comparative level. However, scaling trends revealed by this study confirm previous work that suggests rodents are generally overbuilt for their size and that larger primates have fewer individual trabeculae that are relatively thinner than what is anticipated for their body size.

The complex interplay of various factors that shape skeletal microarchitecture makes it difficult to confidently infer locomotor modes, especially among species utilizing varied, overlapping repertoires, and trabecular data should therefore be interpreted cautiously. Additional study is needed to avoid oversimplification of both the existing locomotor mode classification systems and the joint loading environment itself, and to appreciate the effects of potential systemic patterning of trabecular bone and the influence of trabecular bone sampling protocol choices (i.e., isolated VOIs versus sampling of entire regions). Accordingly, a useful extension of this study would be to investigate the actual directionality of trabecular alignment by subdividing the hip joint into smaller volumes surrounding the acetabulum to better contextualize the strain environment, a methodological solution posed elsewhere (e.g., Ryan & Test, 2007). However, despite the aforementioned limitations, the findings of this study are consistent with other comparable works utilizing different samples, skeletal elements and methodological approaches and, given the inability to reliably distinguish between locomotor categories, with the conclusions drawn by Ryan and Walker (2010) that suggest the hip joint itself may not be informative in such contexts.