Virtual morphometric method using seven cervical vertebrae for sex estimation on the Turkish population

Sex estimation from skeletal remains is crucial for the estimation of the biological profile of an individual. Although the bones most commonly used for sex estimation are the pelvis and the skull, research has shown that acceptable accuracy rates might be achieved by using other skeletal elements such as vertebrae. This study aims to contribute to the development of sex estimation standards from a Turkish population through the examination of CT scans from the seven cervical vertebrae. A total of 294 individuals were included in this study. The CT scans were obtained from patients attending the Bakirkoy Training and Research Hospital (Turkey) and the data were collected retrospectively by virtually taking measurements from each cervical vertebra. The full database was divided into a training set (N = 210) and a validation set (N = 84) to test the fit of the models. Observer error was assessed through technical error of measurement and sex differences were explored using parametric and non-parametric approaches. Logistic regression was applied in order to explore different combinations of vertebral parameters. The results showed low intra- and inter-observer errors. All parameters presented statistically significant differences between the sexes and a total of 15 univariate and multivariate models were generated producing accuracies ranging from a minimum of 83.30% to a maximum of 91.40% for a model including three parameters collected from four vertebrae. This study presents a virtual method using cervical vertebrae for sex estimation on the Turkish population providing error rates comparable to other metric studies conducted on the postcranial skeleton. The presented results contribute not only to the development of population-specific standards but also to the generation of virtual methods that can be tested, validated, and further examined in future forensic cases.


Introduction
Sex estimation through the examination of skeletal remains is one of the first steps in creating a reliable biological profile and plays a key role in terms of identification [1,2]. In most cases, sex must be estimated before age, ancestry, and stature due to biological differences between males and females having an impact on the assessment of other pieces of biological information [2,3]. Anthropological sex estimation consists of two main methodological approaches: morphological and metric analyses [2]. Morphological approaches are based on the visual evaluation of sexually dimorphic features and are mainly focused on the pelvis and skull, but also on the overall status including the differentiation of the robusticity of the bones and observations of various muscle marks [2]. On the other hand, metric methods are based on size differences between female and male individuals and typically use other postcranial elements along with the skull and the pelvis [2].
Metric approaches remain the most commonly used methods for sex estimation among forensic anthropologists [4]. Although the most reliable sex classification results are obtained from the analysis of the pelvis and skull [5], in situations such as natural and/or mass disasters, it may not always be possible to examine these skeletal elements due to animal activity, deliberate damage and skeletal disruption due to taphonomic processes. Jantz and Ousley [4] reported that postcranial measurements showed greater shape dimorphism than did cranial measurements, both individually and in combination. Thus, sex estimation may be required through the examination of other skeletal elements, as demonstrated by previous studies focusing on metatarsals, carpal bones, long bones, scapula, clavicle, patella and sternum [1]. In forensic investigations, methods used in sex estimation are expected to have an accuracy rate of more than 80% [3].
Metric methods use statistical analysis involving various approaches such as discriminant function analysis and logistic regression that provide equations to estimate sex in an unknown individual [6]. This makes it easier to assess quantitative outcomes obtained by osteometric approaches [1,7]. With the development of molecular techniques, sex can be determined more reliably by identifying the presence or absence of the Y chromosome in the analysis performed from skeletal remains [2]. Molecular methods may be useful especially in juvenile skeletal remains, where osteological sex estimation methods present significant limitations [2]. Sexual dimorphism in the skeleton begins to develop with the release of sex hormones during puberty and becomes obvious around the age of 17 in many populations [2]. For this reason, many anthropologists consider sex estimation from the skeleton controversial before the age of 15, where molecular methods might assist in sex discrimination better than osteological sex biomarkers [2]. However, molecular methods entail a sophisticated analysis requiring high skills and expensive and advanced laboratory equipment [1]. Thus, they are considered complicated, costly, invasive, and time-consuming [1,8]. The upper spine is one skeletal segment that remains to be further explored for sex estimation. However, sex estimation studies on the cervical vertebrae are limited and mostly focus on the axis and atlas or on the seventh cervical vertebra due to their atypical morphological structure and easier identification [9][10][11][12][13][14]. Research performed on the remaining cervical vertebrae in terms of sex estimation is thus limited [15], but accuracies over 80% have been reported by previous studies [9][10][11][12][13][14][15][16]. 
As the degree of sexual dimorphism and body proportions can differ between populations, and the most accurate vertebral sex markers may differ between geographical samples, population-specific studies are required to provide the most accurate outcomes [6].
At present, no research has been conducted on a Turkish population regarding sex estimation using a segment of the spine. Thus, our study aims to explore the possibility of developing population-specific standards for a Turkish sample by acquiring osteometric parameters through CT images of the seven cervical vertebrae, and to provide alternatives in case the whole skeleton is not preserved.

Image acquisition
All cervical CT examinations were performed using a 128 slice multidetector CT scanner (Siemens Medical Solutions, Erlangen, Germany). A routine cervical CT protocol was followed with a 1-mm slice thickness in the supine position. Tube voltage was 120 kV; effective mAs was adjusted by Siemens "SAFIRE reconstruction software." Gantry rotation was 0.5 s, collimation was 0.6 mm, and the pitch was 1.2 mm. All patients underwent imaging from the base of cranium to thoracic inlet.

Image analysis
Three measurements were collected from each of the seven cervical vertebrae: maximum cervical vertebral body height (CHT), cervical anterior-posterior diameter (CAP), and cervical transverse diameter (CTR); CHT was not collected for the first cervical vertebra, as this vertebra does not have a centrum. In total, 20 measurements were taken for the purposes of this study.
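As a quick check of the measurement count, the sketch below enumerates the parameter labels. It assumes the labels follow the C{n}HT / C{n}AP / C{n}TR pattern used later in the text (e.g. C3HT, C5AP); the function name is hypothetical.

```python
def measurement_labels():
    """List the measurement labels for C1-C7, skipping body height for C1."""
    labels = []
    for vertebra in range(1, 8):
        for suffix in ("HT", "AP", "TR"):
            # C1 has no centrum, so no body height (HT) is collected for it.
            if vertebra == 1 and suffix == "HT":
                continue
            labels.append(f"C{vertebra}{suffix}")
    return labels


labels = measurement_labels()
print(len(labels))  # 7 vertebrae x 3 parameters, minus the missing C1 height
```

Three parameters on seven vertebrae would give 21 labels; removing the C1 body height leaves the 20 measurements reported here.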
The measurement parameters determined for the spinal canal and vertebral corpus of each vertebra from C1 to C7 are explained below and shown in Figs. 1, 2, 3 and 4.
First, coronal and sagittal reformatted images were created from axial source images.
Secondly, for spinal canal measurement at the level of each cervical vertebral corpus, in the coronal reformatted images, the midline was determined with reference to the odontoid process (Fig. 1c, blue reference line). Then, in the sagittal images, the line that fits the posterior contour of the vertebral corpus to be measured was chosen as a reference for each vertebra (Fig. 1b, green reference line). After these two arrangements, the section level where the spinal canal can be viewed continuously in the axial images was chosen (Fig. 1a). In this section, spinal canal anterior-posterior diameter and medial-lateral diameter measurements were made (Fig. 2).
Thirdly, for height measurement at the level of each cervical vertebral corpus, the anterior-posterior diameter and medial-lateral diameter of the vertebral corpus were determined in the axial source images, and the reference lines in the axial source and sagittal reformatted images were localized to this point (Fig. 3a and b). Vertebral corpus height was then determined by measuring between the superior and inferior endplates in coronal reformatted images (Fig. 4).
All measurements were performed by two independent observers. Both were experienced radiologists, fully trained in musculoskeletal radiology and forensic imaging. Each had 10 years of experience in the field of forensic radiology and had completed different virtual anthropology studies. The observers also studied published articles on the cervical vertebrae and were trained in measurement techniques with an experienced forensic pathologist.
Statistical analysis
The SPSS 22.0 (IBM Corporation, Armonk, NY, USA) program was used for statistical analyses.
Observer error was measured through technical error of measurement (TEM) analysis [17]. Intra-observer error was assessed on 40 random CT images with a two-week interval between the first and the second observation. Inter-observer error was performed on the same 40 CT images by a second observer.
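The observer error statistics can be illustrated with a minimal sketch. It assumes the commonly used definitions for two measurement sessions: TEM = sqrt(sum(d^2) / 2n), relative TEM (rTEM) = 100 × TEM / grand mean, and coefficient of reliability R = 1 − TEM^2 / SD^2, where SD^2 is the variance of all pooled measurements. The exact formulas applied in this study are those of [17]; the function names and example values below are hypothetical.

```python
import math
from statistics import mean, variance


def tem(first, second):
    """Technical error of measurement for two repeated sessions."""
    if len(first) != len(second) or not first:
        raise ValueError("sessions must be non-empty and of equal length")
    squared_diffs = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(squared_diffs / (2 * len(first)))


def relative_tem(first, second):
    """TEM expressed as a percentage of the grand mean (rTEM)."""
    grand_mean = mean(list(first) + list(second))
    return 100 * tem(first, second) / grand_mean


def reliability(first, second):
    """Coefficient of reliability R: share of variance not due to measurement error."""
    total_variance = variance(list(first) + list(second))
    return 1 - tem(first, second) ** 2 / total_variance


# Hypothetical repeated measurements (mm) of one parameter on four scans.
session_1 = [10.0, 12.0, 14.0, 16.0]
session_2 = [11.0, 11.0, 15.0, 15.0]
print(round(tem(session_1, session_2), 3))
print(round(relative_tem(session_1, session_2), 2))
print(round(reliability(session_1, session_2), 3))
```

Under these definitions, an R of 0.89 means that roughly 11% of the observed variance is attributable to measurement error.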
From the total sample set (N = 294), 70% (N = 210) was used for the generation of the logistic regression (LR) equations and 30% (N = 84) was used as a validation set, with the main statistical analysis being performed on the developmental set. First, the dataset was examined to assess normality, outliers, skewness and kurtosis. Secondly, all the parameters were examined to explore whether statistically significant differences exist between males and females. Binary LR was the statistical approach used in this study, as the data did not meet the assumptions for discriminant function analysis. Moreover, LR is more flexible, as it is less sensitive to outliers and generally more tolerant of collinearity among predictors [9]. LR discriminates between groups according to the following formula: $L = C + b_1 X_1 + b_2 X_2 + \dots + b_n X_n$, where L is the logit or log-odds, C is the constant, and b_i and X_i are the regression coefficient and the measurement, respectively. The cutting point is set at 0.5, with males scoring over this value. LR modeling was performed considering the identification of vertebra number, the combination of variables, and the recovery and preservation of a partial or full set of cervical vertebrae. Models were evaluated based on LR assumptions and overfitting, favoring the fewest variables (to reduce measurement error) that achieved the highest correct classification.
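The way an LR equation is applied to an unknown individual can be sketched as follows. The sketch assumes the standard linear logit, L = C + b1X1 + ... + bnXn, and assumes that the 0.5 cutting point applies to the probability obtained from the logistic transform of L. The constant and coefficients are hypothetical placeholders, not values from the tables of this study.

```python
import math


def logit(constant, coefficients, measurements):
    """Linear predictor L = C + sum(b_i * X_i) over the model's variables."""
    return constant + sum(
        coefficients[name] * measurements[name] for name in coefficients
    )


def probability_male(log_odds):
    """Logistic transform of the logit into a probability between 0 and 1."""
    return 1 / (1 + math.exp(-log_odds))


def classify(constant, coefficients, measurements, cutoff=0.5):
    """Return 'male' when the predicted probability exceeds the cutting point."""
    p = probability_male(logit(constant, coefficients, measurements))
    return "male" if p > cutoff else "female"


# Hypothetical model and measurements (mm), for illustration only.
constant = -20.0
coefficients = {"C2HT": 0.5, "C2AP": 0.25}
unknown = {"C2HT": 36.0, "C2AP": 16.0}
print(classify(constant, coefficients, unknown))
```

A probability of exactly 0.5 corresponds to L = 0, so the rule is equivalent to classifying as male whenever the logit is positive.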
LR modeling was first run for each independent vertebra by manually entering all three parameters in the model (CHT, CAP, and CTR). The second step in the generation of the optimal models consisted of combining two consecutive cervical vertebrae (e.g. C2 and C3, C3 and C4, and so on) in an attempt to provide equations for vertebrae that can be identified based on the articulation with the consecutive one. Note that some model modifications were made after the observation of model fit indicators, and in some instances, variables not contributing to the model were removed in order to improve the fit. LR was also run on three consecutive vertebrae, although the results did not show any improvement compared with two consecutive vertebrae, and therefore, these models are not reported. The third step was to use C1 to C7 to create a single equation for the scenario in which the full set of cervical vertebrae is intact and present. Note that although the total number of parameters assessed in this study is 20, a maximum of 10 variables is recommended due to statistical constraints based on the sample size [18]. As CHT was reported to be the most sexually dimorphic parameter, stepwise forward LR was performed on the CHT measurements for all cervical vertebrae, followed by the inclusion of the CTR and CAP blocks of measurements, separately. In the last step of the statistical analysis, stepwise forward LR modeling was used including all 20 measurements to identify the optimal combination of variables. Only LR equations that achieved overall correct sex classification over 80% are reported here.
Fig. 3 Determination of the CHT with axial (a), sagittal (b), coronal (c), and three-dimensional projections of section plans (d) reconstructed CT images
Fig. 4 Measurements of CHT on coronal CT image
The selected models were then cross-validated using the validation set to test their stability; models whose classification accuracy held within 10% of that obtained in the original primary sample were considered valuable [19].
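The stability criterion can be sketched as a simple comparison of classification accuracies. The sketch assumes that "within 10%" means an absolute difference of at most 10 percentage points between the primary sample and the hold-out sample; the function names and figures are hypothetical.

```python
def accuracy(predicted, actual):
    """Percentage of individuals whose predicted sex matches the known sex."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100 * correct / len(actual)


def is_stable(primary_accuracy, holdout_accuracy, tolerance=10.0):
    """True when the hold-out accuracy stays within the tolerance (percentage points)."""
    return abs(primary_accuracy - holdout_accuracy) <= tolerance


# Hypothetical accuracies, for illustration only.
print(is_stable(91.4, 85.0))
print(is_stable(91.4, 80.0))
```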
The individuals included in this study were divided into five age cohorts (18-29, 30-39, 40-49, 50-59, and 60 years old and above). The defined age groups were tested for normality using the Shapiro-Wilk test. Normality was violated on all occasions and the null hypothesis of equal covariance matrices was rejected; thus, a non-parametric test (Kruskal-Wallis test) was used to explore whether there were any statistically significant differences between age groups for cervical measurements at a significance level of p < 0.05.
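The age grouping and the Kruskal-Wallis statistic can be sketched in plain Python. This is an illustrative implementation of the textbook H statistic with the usual tie correction, not the SPSS procedure used in the study; the function names and sample values are hypothetical. The p value would be obtained by comparing H with a chi-square distribution with k − 1 degrees of freedom, which is omitted here.

```python
from collections import Counter


def age_cohort(age):
    """Map an adult age in whole years to one of the five cohorts used in the study."""
    if age < 18:
        raise ValueError("only adults (18+) are included")
    if age >= 60:
        return "60+"
    if age < 30:
        return "18-29"
    lower = (age // 10) * 10
    return f"{lower}-{lower + 9}"


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction for a list of samples."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)
    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    rank_term = sum(
        sum(rank_of[value] for value in group) ** 2 / len(group) for group in groups
    )
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)
    ties = Counter(pooled)
    correction = 1 - sum(t ** 3 - t for t in ties.values()) / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    return h / correction


# Hypothetical measurements (mm) for three age groups.
print(age_cohort(45))
print(round(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]), 2))
```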

Results
This project includes 41,160 metric data points obtained from CT images of 2058 vertebrae of 294 adults from a contemporary Turkish population.
Observer error results are shown in Table 1. Both intra and inter-observer errors fell within the limits of acceptance as seen by the low values for rTEM and R. The only two parameters showing more than 10% of the variance related to variability between subject scores were reported for C3HT and C5AP (R = 0.89).
Demographic data for males and females for both the developmental set (106 males and 104 females) and the validation set (40 males and 44 females) are presented in Table 2.
A sexual dimorphism indicator (SDI) per measurement was calculated following Gama et al. [9]. If the index is higher than 10%, the parameters are considered to be strongly sexually dimorphic: SDI = ((Males mean − Females mean) / Males mean)*100.
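The SDI formula above translates directly into code. The means used in the example are hypothetical values chosen for illustration, not results from this study.

```python
def sexual_dimorphism_index(male_mean, female_mean):
    """SDI = ((male mean - female mean) / male mean) * 100."""
    return (male_mean - female_mean) / male_mean * 100


def is_strongly_dimorphic(sdi, threshold=10.0):
    """A parameter is considered strongly dimorphic when its SDI exceeds 10%."""
    return sdi > threshold


# Hypothetical group means (mm) for a single measurement.
sdi = sexual_dimorphism_index(male_mean=25.0, female_mean=20.0)
print(round(sdi, 2), is_strongly_dimorphic(sdi))
```

Because the difference is scaled by the male mean, the index expresses how much smaller the female mean is as a percentage of the male mean.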
Regarding the assessment of normality, only two parameters, C4HT and C1AP, were non-normally distributed as indicated by Shapiro-Wilk test (p value < 0.05). Thus, those measurements were subject to non-parametric statistical tests.
To examine differences in cervical measurements between males and females, an independent sample t test or a Welch test was performed based on the results of Levene's test of equality of variances, while the non-parametric equivalent (Mann-Whitney U test) was performed on C4HT and C1AP. All normally distributed parameters demonstrated statistically significant differences between the sexes, with all p values being less than 0.001 (Table 3). Additionally, the Mann-Whitney U test indicated statistically significant differences for C4HT and C1AP (p < .0001, z = −9.84 and z = −7.75, respectively).
Table 4 shows the results for LR modeling for each independent vertebra, with all three parameters (CHT, CAP, and CTR) entered manually in the model. All the LR equations generated using parameters from each independent vertebra demonstrated statistically significant models in comparison to the null model. Note that model M2 does not include all three parameters because a better-fit model was created by excluding those parameters that did not contribute significantly. The percentage of correct classification ranges from 83.8% for M5 up to a maximum of 87.6% for M2 (refer to Table 7). Note that the first cervical vertebra is not reported, as the percentage of correct classification was lower than 80% (75% accuracy).
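The choice among the three tests used for the male-female comparisons can be sketched as a small decision rule, together with the Welch statistic for the unequal-variance case. This is a plain-Python illustration using the textbook Welch formula with Welch-Satterthwaite degrees of freedom, not the SPSS output of this study; the function names and sample values are hypothetical.

```python
import math
from statistics import mean, variance


def choose_test(is_normal, equal_variances):
    """Pick the two-sample test from the normality and Levene results."""
    if not is_normal:
        return "Mann-Whitney U"
    return "Student t" if equal_variances else "Welch t"


def welch_t(sample_a, sample_b):
    """Welch's t statistic and its Welch-Satterthwaite degrees of freedom."""
    var_a = variance(sample_a) / len(sample_a)
    var_b = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(var_a + var_b)
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(sample_a) - 1) + var_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical measurements (mm) for two small groups.
print(choose_test(is_normal=True, equal_variances=False))
t_value, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(round(t_value, 3), round(dof, 2))
```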
Measurements from two consecutive vertebrae were combined and LR models were generated (Table 5). The resulting models were statistically significant and correctly classified individuals with percentages ranging from 86.70 to 88.10 (refer to Table 6). Stepwise LR was performed by first combining the full set of the most dimorphic parameters (CHT) and then separately adding the CTR and CAP cervical measurement sets. As seen in Table 6, both M14 and M15 include cervical vertebrae 2, 4 and 7. For the model created by the inclusion of all 20 parameters, four different vertebrae are necessary in order to use the equation (Table 7).
Regarding the statistical analysis for each measurement according to age groups, a significant difference was observed for C5AP (p = 0.040) and C4TR (p = 0.048) for males and for C3HT (p = 0.000), C3AP (p = 0.038), C6AP (p = 0.005), C3TR (p = 0.003), and C4TR (p = 0.025) for females.

Discussion
The current study examined the seven virtually reconstructed cervical vertebrae to establish an accurate sex estimation method for the Turkish population. Morphometrically, 20 linear parameters of the seven cervical vertebrae were measured on CT images, and the predictive accuracy of the resulting logistic regression equations ranged from 83.30% to 91.40%.
The cervical spine consists of three atypical vertebrae (C1, C2, and C7) and four typical vertebrae (C3-C6). The vertebral body consists of several projections used for articulation with the vertebral arch [20]. The first vertebra (C1) is the largest, annular in shape and lacking a vertebral body and spinous process [20]. The second cervical vertebra (C2) includes an odontoid process extending from the vertebral body [20]. The seventh cervical vertebra (C7) presents a prominent spinous process, known as the vertebra prominens, permitting differentiation from the third to sixth cervical elements [20]. The morphological characteristics of the cervical vertebrae make it easy to create anatomical sequences among all skeletal elements, including consecutive vertebrae [15]. The vertebral body provides strength and support to two-thirds of the vertebral load and is resistant to mechanical stresses and taphonomic changes due to its strong cortical and dense internal trabecular bone structure [21]. The vertebral column has an intact structure that is less affected by taphonomic changes, yet the spinous and transverse processes are more sensitive to taphonomic alterations and may become fragmented [15,21]. Therefore, it may be more difficult to collect metric data from these specific landmarks, and further consideration must be taken when developing metric methods from these skeletal elements.
The traditional anthropological assessment of bones consists of the direct observation of skeletal remains [5]. In the past two decades, CT and three-dimensional (3D) reconstruction have played an important role in an increasing number of forensic cases and mass disasters due to the potential application of medical imaging to forensic anthropology [22]. Research has also been conducted in contemporary populations to test and review traditional anthropological methods and generate population-specific data [1,23].
To provide accurate forensic anthropological information for the identification of unknown individuals, the methods used in the identification process must be tested for error rates [24]. It is well known that among the wide range of techniques used, metric analysis is considered less subjective than morphological approaches, as it is subject to statistical analysis [6]. Moreover, population-specific methods have shown higher accuracies when applied to target populations closer to the reference population. The current study aspires to develop sex estimation equations based on cervical vertebral dimensions in a modern Turkish sample, employing data from computed tomography (CT). This is the first study on the subject in this population.
It is important that the methods and techniques used in forensic anthropology are reproducible and highly reliable [2,3]. Errors of less than 10% are considered acceptable in intra- and inter-observer error analysis [2][3][4][5]. In this study, intra- and inter-observer error tests were applied, and error variations were below 10% for all intra-observer measurements. For inter-observer error, variations were below 10% for all measurements except C3HT and C5AP (R = 0.89) (Table 1). Rozendaal et al. [15], who evaluated similar measurement parameters, reported error variations of less than 10% for both the Athens and Luis Lopez collections. However, they reported an inter-observer error of 25.18% for C1TR, which may be due to a misunderstanding of the measurement technique by one researcher. The acceptable error values detected in both studies support the reproducibility of the measured values for the cervical vertebrae. However, the difference in measurement techniques should be considered. Past studies have shown a high similarity between direct measurements of dry bone and measurements from radiological images of the same bone [25,26]. Radiological methods are suitable, and CT images in particular can be used in anthropological practice [27]. In addition to CT images, image analysis programs are becoming more common and can contribute to the speed and accuracy of the analysis. Furthermore, the use of radiological images can help address ethical and cultural concerns in studies of human remains that would otherwise require maceration. Another important point is that medical imaging saves time and allows rapid evaluation as well as archiving, especially for mass disaster incidents, including the identification and assessment of trauma.
In terms of forensic anthropology, virtual population-specific databases might provide the opportunity for researchers to evaluate, validate and develop methods when osteological collections are not available, with retrospective examination of radiological images assisting in increasing the number of cases that can be examined.
For the purpose of the study, three measurements (CHT, CTR, and CAP) were obtained from each vertebra and all showed significant differences between females and males (p < 0.001). The sexual dimorphism indicator (SDI) for the Turkish population was found to be above 10% for the maximum vertebral body height (CHT) measurements of vertebrae C2-C7, indicating that this parameter is strongly sexually dimorphic. Rozendaal et al. [15] performed a study on cervical vertebrae from the Athens and Lopes skeletal collections, testing the same parameters as in the present study on dry bone using Vernier calipers. CHT was found to be the most dimorphic measurement, followed by the cervical vertebral foramen transverse diameter (CTR) [15]. However, Rozendaal et al. [15] stated that the cervical vertebral foramen anterior-posterior diameter (CAP) measurements were higher in males than in females, although not to a statistically significant degree, except for those of the C1 vertebrae. Marlow et al. [16] and Wescott [28] stated that the most sexually dimorphic parameters in the C2 vertebra included CHT and CTR and that CHT showed lower sexual dimorphism than CTR. Gama et al. showed that the CAP measurement for the C2 cervical vertebra was not significantly different between the sexes, providing a sexual dimorphism index of 2.7 [9]. In the present research, the CTR measurement of the C2 cervical vertebra was not included in the M2 model because it did not make a significant contribution to sex prediction. Amores et al. obtained eight measurements, with the exception of CHT, from the seven vertebrae, with the anterior and posterior distance of the vertebral canal being the most sexually dimorphic and the anterior-posterior distance showing a higher degree of sexual dimorphism than the transverse width [14]. Kibii et al. reported that the centrum in the seventh vertebra shows a higher degree of sexual dimorphism than the vertebral canal and that the vertebral canal CAP and CTR measurements may not show sexual dimorphism in different populations [13], as corroborated in our study, in which C7AP and C1AP demonstrated sexual dimorphism indexes of 8.40% and 8.28%, respectively. Vertebral body heights reach full skeletal maturity in the 20th year of life and thus are more affected by secondary sexual development and environmental effects than vertebral foramen measurements [15,29]. The CHT measurement shows a higher degree of sexual dimorphism in all cervical vertebrae than the CTR and CAP parameters due to differences in the developmental stages of CHT [15]. Some studies have reported CAP measurements showing minimal sexual dimorphism [15,30], although other studies have indicated that CAP measurements demonstrate different degrees of sexual dimorphism when assessed in different populations [28,31,32]. In our study, sexual dimorphism was indeed observed in the CAP measurements. The sexually dimorphic aspect of the base of the skull would affect the morphological structure of the C1 vertebra because of the association between the two structures [15,28,31]. This relationship may affect the CAP measurement in the C1 cervical vertebra, explaining the sexual dimorphism reported here [15].
The CHT, CTR, and CAP measurements were evaluated together, and logistic regression analysis was performed for each cervical vertebra. The accuracy ranged from 83.8% for the C5 vertebra to 87.60% for the C2 vertebra. On the other hand, the accuracy for the C1 vertebra was less than 80% (75%). Rozendaal et al. stated that no cervical vertebra alone can exceed 80% accuracy for sex estimation [15].
Given the articulation between the cervical vertebrae, when two consecutive cervical vertebrae were evaluated together (LR models M8-M13), the accuracy ranged between 86.70 and 88.10% and was greater than that of the single-vertebra models. Rozendaal et al. combined the C1 and C2 vertebrae and showed an accuracy of 72.8%; however, C2 and C5 showed an increase in correct classification, resulting in an accuracy of up to 77% [15]. Furthermore, when three consecutive vertebrae were evaluated together in our study, there was no remarkable increase in accuracy relative to that of models generated with two consecutive vertebrae.
In our study, CHT demonstrated the highest degree of sexual dimorphism, and the CHT measurements from all cervical vertebrae were entered into a model by stepwise logistic regression analysis (model 14); the resulting accuracy was 87.60%. Additionally, all CTR measurements were included in model 15, yielding an accuracy of 90%. The highest correct classification achieved in this study was 91.40%, which was obtained with a model including five measurements selected by stepwise LR from all cervical vertebrae (model 16). The statistical model of Rozendaal et al. included the twenty CHT, CTR, and CAP measurements in total from all cervical vertebrae and yielded an accuracy of 84.1% [15]. When they used a stepwise method, seven measurements (C1AP, C2HT, C2TR, C3HT, C5TR, C5HT, C7TR) exhibited large t-value coefficients and the discriminant function resulted in an accuracy rate of 82.6%. In our model 16, it was striking that there was no CTR measurement. As expected, higher accuracy is obtained by using more measurement parameters and more vertebrae [11,15,16,28].
Vertebral anatomy can be directly affected by pathological and aging processes in the vertebrae. Degenerative processes in the cervical spine usually present either in an idiopathic form, where no predisposing factor is obvious, or in a congenital form, in which a predisposing factor such as a metabolic disorder or trauma is present. The idiopathic form is related to aging, and the aging process in the cervical spine can cause numerous pathologies involving both surrounding tissue and bone structure [33][34][35][36][37].
Ezra et al. [37] reported that cervical vertebral height decreases with age (mainly in C3-C6) while the vertebral body expands and vertebral foramen size is independent of age, and emphasized that the enlargement of vertebral bodies may be a mechanism secondary to the decrease in vertebral body height with age. In our study, age-related differences were found for more parameters in females than in males. In interpreting the obtained data in relation to age differences, the biological and genetic factors, pathological processes and population-specific features related to sex that can affect vertebral dimensions should be examined in more detail in further research. In this respect, before analyzing the direct effect of the data obtained in our study on sex estimation, it is necessary to consider all these factors in more detail; future prospective studies with better clinical and socio-economic histories and an equal, homogeneous age distribution between the sexes can address this. As a final note, despite the evidence that the vertebral body exhibits age-related changes, introducing the age factor in sex estimation formulae has little practical significance. In essence, if sex needs to be estimated from the vertebrae, accurate age estimation would also be difficult to perform on the same set of human remains, making the application of age-specific sex estimation formulae impossible. Thus, one may consider using the method while taking into account this inherent drawback: a certain bias in the sex estimate due to the effect of age.
Many studies have demonstrated the impact of applying sex estimation formulae to skeletal remains that are not closely related to the reference population [4,6,15,32]. Genetics, environmental effects, socio-economic status, and secular differences, among others, are known to affect the size and structure of the skeletal system. Thus, differences between the data of our study and those of other studies are expected based on the aforementioned factors.
Although further validation is required to test the direct applicability of virtual methods on dry skeletal elements and to validate the application of the present formulae to other populations, our research presents the first sex assessment method performed through 3D images of cervical vertebrae in this population and constitutes a contribution to the further development of Turkish population-specific standards.