A detailed method for creating digital 3D models of human crania: an example of close-range photogrammetry based on the use of Structure-from-Motion (SfM) in virtual anthropology

Photogrammetry is a survey technique that allows for the building of three-dimensional (3D) models from digitized output data. In recent years, it has been confirmed as one of the best techniques for building 3D models and is widely used in several fields such as life and earth sciences, medicine, architecture, topography, archaeology, crime scene investigation, cinematography, and engineering. Close-range photogrammetry, in particular, has several applications in osteological studies, allowing the creation of databases of 3D bone models available for subsequent qualitative and quantitative studies. This work provides a step-by-step guide to the photo acquisition protocol and the photogrammetric workflow for creating high-resolution 3D digital models of human crania. Our method, based on Structure-from-Motion (SfM), uses single-camera photogrammetry to capture chromatic details and reconstruct shape with a scale error of less than 1 mm. Using relatively inexpensive and easily transportable equipment and a quick and simple protocol, realistic and accurate models can be produced with minimal effort.


Introduction
Photogrammetry is a survey technique that allows for the building of 3D models starting from digitized output data (2D images) (Mikhail et al. 2001). It identifies the spatial positions of all the features (shapes and colours) of the considered object by detecting a series of local motion signals: arbitrary blocks of pixels used as motion vectors. Local motion signals are determined through a technique called Structure-from-Motion (SfM), in which either the camera is fixed and the target rotates or the camera moves around the fixed target (Schonberger and Frahm 2016). Local motion signals are, therefore, used to determine and calibrate the spatial positions of the object points that will shape the model. Today, photogrammetric 3D models are widely used in several fields such as life and earth sciences, medicine, architecture, topography, archaeology (Alliez et al. 2017; Chodoronek 2015; Ducke et al. 2011; Pescarin et al. 2013; Profico et al. 2019), crime scene investigation, cinematography, and engineering (Linder 2009).
Currently, photogrammetry is an affordable, powerful, readily available, and versatile technique for surface data acquisition, able to guarantee levels of detail and precision equivalent to those of other tools that are generally more expensive and bulkier, such as laser scanners or CT (De Reu et al. 2013; Jurda Ubanova 2016; Rangel-de Lázaro et al. 2021; White and). Nowadays, its wide use in biological anthropology (Bennett 2015; Katz and Friess 2014; Morgan et al. 2019; Perevozchikov 2019; Lussu and Marini 2020; Omari et al. 2021) is due to the increasing demand for three-dimensional copies of anatomical models, supported by the availability of programs, high-performance personal computers, and digital cameras that can generate high-resolution models.
The resolution of the photos and the photogrammetry workflow are essential to capture and maintain details and to scale the models with a high degree of accuracy. In addition, along with the hardware used, they both greatly influence the costs and time required for the generation process. Starting from these general issues, the objective of this study is to provide a step-by-step guide to an easy photo acquisition protocol and a photogrammetric workflow useful for building high-resolution digital 3D models of human crania by combining SfM and close-range photogrammetry. The scientific method proposed makes it possible to obtain high-resolution models with relatively inexpensive equipment and minimal effort by manipulating only a few of Agisoft-Metashape's main settings and tools. This eases all the operational constraints of tool transportability necessary for 3D documentation both inside and outside of a laboratory.
Considering that the time frame is strongly influenced by the photos (number and resolution) and by the hardware used, we optimized times and costs using a medium-high performance camera and hardware that allowed us to have a complete model in approximately 2 h (around 15 min for shooting, around 20 min for the conversion from RAW to JPG, and around 1.15 h for the workflow).
As the accuracy of photogrammetric models has already been demonstrated (Morgan et al. 2019; Westoby et al. 2012), to support the standardisation of the method we complemented our data with a brief statistical analysis of the possible systematic errors that could be introduced by the instruments or by the operator.

Materials
Forty-five adult crania were selected, using the standards proposed by Buikstra and Ubelaker (Buikstra et al. 1994; Ubelaker 1989), from the collection of Alia (Sicily, Italy) based on their integrity; broken or incomplete specimens (bones not in anatomical connection and/or lacking landmarks necessary to take the anthropometric measurements mentioned below) were excluded before the analysis. All the specimens are part of a nineteenth-century archaeological collection of remains that were buried in a partially modified natural cave (Mannino 2016) during the 1837 cholera outbreak and subsequently rediscovered in 1996 (Chiarelli et al. 2002). To ease reading, the specimens are referred to using the progressive identification numbers A.01, A.02, A.03, and so on.

Photography settings
The skulls were photographed using a 20-megapixel Nikon Coolpix B700 bridge digital camera equipped with a Nikkor 60X Wide Optical Zoom ED VR 4.3-258 mm 1:3.3-6.5 lens, mounted on a Manfrotto MKCOMPACTLT-BK tripod positioned 30 cm from the specimens. All images were taken in manual mode (ISO 200, shutter speed 1/30 s, f-stop 3.3, with a fixed focal distance of 24 mm) in RAW format and then converted to high-resolution JPG format using Camera Raw 10.3 software (alternatively, Photoshop Express Free is a functionally equivalent software) (Table 1 -I).
In order to acquire the merge coordinates, four 2-mm markers were drawn, one for each side of the specimens' skulls, with a pencil (FB graphite) before the photography process. In detail, these markers were placed one on the frontal bone, one on the left and one on the right temporal bone, and one on the occipital bone (Fig. 1a). Markers can be positioned anywhere on the surface of the four mentioned bones without influencing the process. The advantages of this approach are the quick and easy application and removal on different surfaces and materials, the low cost, the slight impact on the texture, and the high alignment precision.
This type of pencil marking may not be suitable in some cases, either for ethical reasons (related to the marking of human remains) or because of the nature of the object: on porous and brittle surfaces such as fossils, on soft tissue, on very dark specimens, and on some cultural artefacts, it may be difficult or impossible to draw or remove the markers, or to identify them on the dense cloud. Alternatively, it is possible to use structures and features already present on the skull (or the object) as references for positioning the markers without influencing the workflow or the result.
The specimens were rotated through 360° and photographed at 20° rotation intervals (Fig. 1b). At each interval, photographs were taken at three different inclinations (0°, 30°, and 45°) (Fig. 1c). To increase accuracy, four physical rulers were placed on the graduated rotating turntable to scale the models. To ensure that the photographs completely covered each sample, two sets of 54 images were taken for each specimen, one in anatomical position (with the basicranium facing downwards) and one in inverted position (with the basicranium facing upwards; the calvaria were positioned with the vertex on a plastic ring with a diameter of 15 cm) (Fig. 1d-e). This resulted in 108 camera positions taken in six rounds encircling each specimen.
All the tools employed are commonly available and inexpensive: rotation was carried out using a 30.5 cm, 360° turntable typically used for clay sculptures, ceramics, TVs, and monitors. Each 20° interval, measured manually with a goniometer, was marked, and four 10-cm paper rulers with millimetre graduations were positioned on the turntable. Inclination was measured with an analogue clinometer held against the camera. The photos were taken in a PhotoBox with a dark or blue background and neutral light sources coming only from the top (Table 1 -II).
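The acquisition geometry described above can be enumerated in a few lines. The following is a minimal sketch, not part of the published protocol: the function names and the orientation labels are hypothetical, and it only reproduces the counting implied by the text (20° rotation steps, three inclinations, two specimen orientations).

```python
from itertools import product


def camera_positions(step_deg=20, inclinations_deg=(0, 30, 45)):
    """Return (rotation, inclination) pairs for one specimen orientation.

    Each inclination corresponds to one circle of shots around the turntable.
    """
    if step_deg <= 0 or 360 % step_deg != 0:
        raise ValueError("step_deg must be a positive divisor of 360")
    rotations = range(0, 360, step_deg)
    return [(rot, inc) for inc, rot in product(inclinations_deg, rotations)]


def full_schedule(orientations=("anatomical", "inverted"), **kwargs):
    """Repeat the same set of camera positions for every specimen orientation."""
    return [
        (orientation, rot, inc)
        for orientation in orientations
        for rot, inc in camera_positions(**kwargs)
    ]


one_set = camera_positions()
schedule = full_schedule()
print(len(one_set), len(schedule))  # 54 108
```

Changing `step_deg` or `inclinations_deg` shows how the number of photos grows when smaller rotation intervals or additional circles are used for other objects.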

Photogrammetry workflow
For each sample studied, a digital 3D model was generated using the software Agisoft-Metashape Professional Edition Version 1.6.4 developed by Agisoft LLC (St. Petersburg, Russia) (https://www.agisoft.com/). Metashape uses a four-step workflow: (I) alignment and generation of sparse cloud, (II) dense point cloud construction, (III) mesh construction, and (IV) texture construction. Each step has a set of customisable tools that control the final model appearance.
(I-II) The two sets of images, acquired for chunk 1 (anatomical position) and chunk 2 (inverted position), were initially processed separately. Specific workflow tool settings are outlined in Table 2. 3D model generation, therefore, advanced separately until the dense cloud step (step II), at which point the two newly generated chunks were cleaned (erasing the overlapping areas of the two chunks and the small artefacts sometimes recorded from the background), aligned, and merged using the marker-based method, which made use of the four previously positioned markers (workflow sequence 'align chunk' and 'merge chunk') (Fig. 2c).
The merged dense cloud (Fig. 2d) was scaled using the four 10-cm rulers as metric reference (reference scale reporting centimetres and millimetres), on each of which the sample size of 10 cm was marked. The reference distances must be representative of the extension of the object. For this reason, for a skull whose order of magnitude is in tens of centimetres, a scale of 10 cm was adopted. The precision of the scaling, which is not affected by the position of the four decimetre references, is provided by the comparison between the references (the redundancy allows the model to be optimized) and the resolution in pixels of the photos taken. Scale by markers is a standard sequence in which the operator initiates a process that Metashape completes and defines with an algorithm involving an automatic camera calibration. This operation combines algorithmic calculations of error and estimation and uses the internal properties of the camera, in particular the focal length of the shot and the image size in pixels (4.3 mm and 5184 × 3888 pixels, respectively). The precision is then finally provided by the pixel dimensions which, being less than this threshold, enable the software to estimate 'optical aberrations' of less than 1 mm. The optical aberrations are errors resulting from the focusing parameters (light changes and …).
(III) Before processing the merged dense cloud to create mesh and texture, the rotating platform and the metric references, which are inevitably recorded and reconstructed when the tripod is positioned 30 cm from the specimen with a focal distance of 24 mm, must be removed. Subsequently, to reconstruct a high-resolution 3D model, the mesh was set up with a 'custom face count' of 2.5 million.
(IV) The texturing process was left on default settings with mosaic blending mode, generic mapping mode, and a texture size/count of 4096 × 1 selected (Fig. 2g).
This process superimposes a more precisely shaded and coloured representation of the specimen over the geometric reconstruction.
In this way, it is possible to visualize and export the model in several formats. The time frame necessary to complete all four steps is closely correlated with the size and number of photos and with the hardware used. The proposed method was fine-tuned to have a model completed in approximately 1 h using medium-high performance hardware based on an i5 64-bit CPU, 16 GB of DDR3 RAM, and a graphics card with a GK107 core (384 shaders, 980 MHz, 64-bit, and 2 GB of DDR3 video memory), running Windows 10 Professional-Educational Edition (Table 1 -III).
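The role of redundant references in the scaling step can be illustrated with a small calculation. This is a minimal sketch under stated assumptions, not Metashape's internal algorithm: it fits a single scale factor by least squares to several ruler lengths measured in unscaled model units, and the function names and input values are hypothetical.

```python
RULER_LENGTH_MM = 100.0  # each paper ruler is 10 cm long, as stated in the text


def scale_from_rulers(model_lengths, true_length=RULER_LENGTH_MM):
    """Fit one scale factor s so that s * model_length is close to true_length.

    model_lengths: ruler lengths measured on the unscaled model (arbitrary units).
    Returns (scale, residuals), with residuals in the units of true_length.
    """
    if not model_lengths or any(length <= 0 for length in model_lengths):
        raise ValueError("model_lengths must contain positive values")
    # Least-squares solution of true_length = s * length over all rulers
    numerator = true_length * sum(model_lengths)
    denominator = sum(length * length for length in model_lengths)
    scale = numerator / denominator
    residuals = [scale * length - true_length for length in model_lengths]
    return scale, residuals


def rms(values):
    """Root mean square of a list of residuals."""
    return (sum(v * v for v in values) / len(values)) ** 0.5


# Hypothetical unscaled lengths of the four rulers (illustrative values only)
scale, residuals = scale_from_rulers([2.001, 1.998, 2.003, 1.999])
print(f"scale = {scale:.4f} mm per model unit, RMS residual = {rms(residuals):.3f} mm")
```

With four references instead of one, the residuals give a direct indication of how consistently the model reproduces the known 10-cm distances.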

Representation and accuracy
The actual representation and accuracy of the models are closely related to the initial quality of the photos (Fig. 3a-b) and the effectiveness of the photogrammetric workflow adopted.
As for the camera settings and the execution of the photos, the proposed methodology covers the entire surface of the skull with 108 easily acquired shots (high-resolution RAW photos) which are ready to be directly converted into JPG and processed. Regarding the workflow, even though it is always recommended that the quality of the sparse and dense clouds (steps I and II) is set to 'high' for models consisting of more than 100 photos, the manufacturer states that the alignment of the dense cloud set on 'medium' reduces processing time without compromising subsequent steps. In addition, the settings of the last two steps, mesh (which includes scaling) and texture (steps III and IV), have been customized to ensure high levels of accuracy and fidelity of the representation. As introduced in the previous paragraphs, the meshing process (also known as polygonalisation) consists of the partition of the surface of the dense cloud into micro-polygonal cells (Frey and George 2007). Through proper algorithms (Catmull 1974), Metashape builds a network that shapes the skull in space. The number of polygons into which the object is divided determines the final resolution of the model. Setting the 'face count' to 2.5 million guarantees a well-modelled human skull (Fig. 3c) which, without further simplifications (smoothing process), provides high-definition models and an easily manageable final output once exported in different formats (e.g. the .ply file size is around 50 MB). Finally, texturisation (visual variation of the continuity on a surface) rasterizes all the two-dimensional images over the mesh (Ebert et al. 2003), reproducing and displaying the real chromatic variations. Using the software's default settings and the original images as input data, our method captured and reproduced the chromatic details (Fig. 3d) with high fidelity.

Repeatability of the process
To determine the repeatability of the method, after visual observations, a statistical distribution analysis (Hammer and Harper 2008) of systematic errors (Taylor 1997) involuntarily committed by the software or the model-building protocol was carried out. The systematic errors analysed were the Metashape scale error and the spread between the real values (considered the true values) and the photogrammetric ones. Therefore, the statistical distribution of these sets of data allows the operators to assess if these differences could be considered aleatory (unpredictable and inevitable errors not related to the method or the operator).
After the model is scaled, Metashape estimates the scale error, an inevitable size error computed by the software after the automatic calibration using the physical dimensions of the rulers as input. Scale error estimates were then automatically generated for each model and represent the spread between the real and virtual data. In detail, scale error is the spread in mm obtained by summing the measurements of the sides of all the polygons that constitute the mesh and dividing the result by the number of sides of the same polygons. These values make it possible to evaluate whether this type of error is merely aleatory or is related to the protocol.
To check whether the two measurement datasets are comparable, a Bland-Altman plot was produced. The Bland-Altman plot (or difference plot) is a data plotting method used to assess the agreement between two different measurement techniques. It displays a dispersion diagram useful for comparing measurements (of identical nature) taken with different instruments. The differences between the model's measurements and the physical ones (considered the true measurements) were employed to standardize the method proposed. For this purpose, two datasets were built by measuring the following distances three times: Glabella-Opisthocranion (GA-OP), Basion-Bregma (BA-BE), Ectoconchion-Ectoconchion (ECT-ECT), and Prosthion-Nasion (PR-NA) (Buikstra et al. 1994). The means were subsequently analysed as proposed by Bland and Altman (1986). The measurements chosen belong to the minimum standard required for documentation as part of the National Forensic Database (Moore-Jansen et al. 1994), as reported by Buikstra and Ubelaker (Buikstra et al. 1994), and use points that can be easily identified and measured. Moreover, they cover the main dimensions of the skull and the face, so they are sufficient to evaluate the repeatability of the 3D digitisation protocol.
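The quantities summarised by a Bland-Altman plot can be computed directly. The following is a minimal sketch assuming the conventional limits of agreement (mean difference ± 1.96 standard deviations) and the sign convention digital minus physical; the function name and the example values are hypothetical and are not the study's data.

```python
from statistics import mean, stdev


def bland_altman(physical, digital):
    """Return the bias, limits of agreement, and plot points for two paired series."""
    if len(physical) != len(digital) or len(physical) < 2:
        raise ValueError("need two paired series with at least two values each")
    # Assumed sign convention: digital measurement minus physical measurement
    diffs = [d - p for p, d in zip(physical, digital)]
    pair_means = [(p + d) / 2 for p, d in zip(physical, digital)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return {
        "bias": bias,
        "lower": bias - 1.96 * sd,
        "upper": bias + 1.96 * sd,
        "points": list(zip(pair_means, diffs)),  # x = pair mean, y = difference
    }


# Hypothetical GA-OP lengths in cm (illustrative values only)
physical = [18.10, 17.65, 18.42, 17.90, 18.05]
digital = [17.82, 17.40, 18.10, 17.63, 17.74]
result = bland_altman(physical, digital)
print({key: round(result[key], 3) for key in ("bias", "lower", "upper")})
```

Points falling outside the interval between `lower` and `upper` correspond to the measurements excluded from the limits of agreement.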
All measurements were taken three times by each of the three operators (OPR-1, OPR-2, and OPR-3). Physical measurements were taken with the same digital calliper, and 3D model measurements were taken using the software Landmark 3.6 (https://www.cs.ucdavis.edu/~amenta/retrodef.html). Both sets of measurements were taken to a hundredth of a centimetre. All statistical analyses were performed using the software PAST 2.0 (https://palaeo-electronica.org/2001_1/past/issue1_01.htm) and the web tool Huygens-Science BA-plotteR (https://huygens.science.uva.nl/BA-plotteR/). In addition to the Bland-Altman plot, the web tool automatically computes the coefficient of reproducibility (also known as Guttman scale; Guttman 1944). The Guttman scale is an ordinal cumulative scale that calculates the degree of deviation of the observed scales; to be accepted, the errors (the coefficient of reproducibility) must be equal to or less than 10% (White and Saltz 2017). Additionally, to compare the difference between the physical and digital measurements with the intraobserver and interobserver error, a one-way analysis of variance (one-way ANOVA) was performed. Given n groups of data, the ANOVA test makes it possible to determine whether at least one of the n means is statistically different from the others (Rohlf and Sokal 1995). The quadratic means between groups and Levene's test for homogeneity of variance based on means were employed for both the intraobserver and interobserver values. The quadratic means between groups (Hammer and Harper 2008) indicate how much variation there is between the available data groups (high values indicate extremely dispersed groups), while Levene's test for homogeneity of variance based on means (Levene 1961) is used to test whether the samples have homogeneity of variance (also called equal variance).
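The between-group quadratic mean and the F statistic of a one-way ANOVA can be sketched as follows. This is an illustrative implementation using only the Python standard library, not the PAST routine used in the study; the function name and the operator values are hypothetical, and the p(same) value, which requires the F distribution, is left out.

```python
from statistics import mean


def one_way_anova(groups):
    """Return the between-group and within-group mean squares and the F statistic."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more values than groups")
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the individual values around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    if ms_within == 0:
        raise ValueError("within-group variance is zero, F is undefined")
    return {"ms_between": ms_between, "ms_within": ms_within, "F": ms_between / ms_within}


# Hypothetical repeated measurements (cm) by three operators, illustrative only
opr_1 = [18.10, 18.12, 18.09]
opr_2 = [18.11, 18.10, 18.13]
opr_3 = [18.09, 18.12, 18.11]
print(one_way_anova([opr_1, opr_2, opr_3]))
```

A small `ms_between` relative to `ms_within` means that the operators' means are close to each other compared with the scatter of the repeated measurements. Where SciPy is available, `scipy.stats.f_oneway` and `scipy.stats.levene` (with `center='mean'`) return the corresponding statistics together with their p values.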

Results
Errors in all 3D digitisation methods are inevitable, but several studies testing the error show that it is so low that it should not affect results (Mendřický et al. 2016, 2018; Sužiedelyte-Visockiene et al. 2019). Therefore, the proposed method is affected by the limits inherent in all photogrammetric models, which have proven to be negligible (Morgan et al. 2019; Lussu and Marini 2020; Katz and Friess 2014; Silvester and Hillson 2020). To support our standardisation goal, statistical distribution analysis was carried out on the scale error and the spread between the measurements taken on the physical skulls and the 3D reconstruction.
As reported in Table 3, the spread of the scale errors is restricted between 0.101 and 0.291 mm with a mean of 0.193 mm. All plots produced adopt a unimodal and normal distribution. The dispersion curve shows a normal (or Gaussian) distribution (Fig. 4a), and the boxplot displays symmetric distribution with the median lying in the middle of the total range of values (Fig. 4b).
Finally, the normal probability plot (comparing the sample distribution with the normal distribution) shows only a few specimens that deviate slightly from normality. The cumulative distribution function of the sample (point) overlaps with the cumulative distribution function for a normal distribution (straight-line). The steeper plot (Fig. 4c) does not present a skewed distribution but rather a mirrored trend and a consequent flattening of the normal probability plot for all values.
Accordingly, based on the results, the scale errors could reasonably be considered aleatory and not related to the photo acquisition protocol or the photogrammetric workflow.
To test whether the systematic error is negligible as well as aleatory, a Bland-Altman plot comparing the spread of the data was also produced for each of the four measurements. The plots include the mean differences for all measurement pairs, and the upper and lower bounds display a high agreement between the physical measurements and those taken on the model. The spread is −0.29 cm for GA-OP (Fig. 5a), −0.27 cm for BA-BE (Fig. 5b), −0.29 cm for ECT-ECT (Fig. 5c), and −0.33 cm for PR-NA (Fig. 5d).
Overall, the mean differences ranged from 0.27 to 0.33 cm in absolute value, with a total average of 0.295 cm. Most of the measurements were included within the limits of agreement (only 6/180 were excluded) and few data points were outside of the 95% confidence interval limit (three outside and five borderline). All the coefficients of reproducibility (errors) are under the 10% acceptance limit (3.5% for GA-OP, 3.0% for BA-BE, 4.3% for ECT-ECT, and 3.7% for PR-NA) with a mean of about 3.2%. Both the Bland-Altman plots and the coefficients of reproducibility support the negligibility of the systematic error and the reproducibility of the protocol; as such, these models can be considered an appropriate representation of the (physical) skulls.
To compare the differences between the means of the physical and digital measurements (Online Resource 1) with the intraobserver and interobserver differences, a one-way ANOVA was performed, taking as null hypothesis (H0) that the means are all statistically equal. If H0 is accepted, all the means have a high homogeneity of variance. As displayed in Online Resource 2, the low quadratic mean indexes and the overall F and p(same) values, all close or equal to 1 (for intraobserver and interobserver differences, respectively), support accepting H0. The same Online Resource shows that H0 is also accepted by Levene's p(same mean) and p(same median), all with a value of 1.
These results both indicate a restricted variation (equal variance) of all intraobserver and interobserver measurements, denoting high repeatability of the process.
Further speculations, aimed to test the observer-error variance relative to the variance due to methods (Online Resource 1), are postponed to subsequent more detailed statistical analyses.

Discussion
Descriptive and quantitative virtual anthropology studies require both high-resolution and high precision 3D models so following rigorous standards is fundamental.
In two sections (Photography settings and Photogrammetry workflow), the fundamental aspects of photogrammetry are systematically described to provide a scientific method based on the optimisation of costs and times while maintaining a high level of accuracy.
The novelty of the Photography settings is the small number of photos necessary for total coverage of the object compared to other protocols that do not use markers (Katz and Friess 2014; Morgan et al. 2019), granting ease and speed in both the acquisition and the processing by the software. Although a Photogrammetry workflow based on two chunks that are initially processed separately has already been proved valid (Mallison and Wings 2014), the present study, unlike other methods, does not use the background points as reference for the chunks' alignment (steps I and II) but uses four markers placed directly on the specimen. The advantage of this method is a fast and accurate dense cloud alignment that reduces the time frame and increases the effectiveness and the representation of the model.
Both the Photography settings and the Photogrammetry workflow create small gaps and artefacts in geometry and structure (no system based on visible light will overcome occlusions entirely). Gaps are areas lacking information (geometry and structure) that Metashape is not able to extrapolate from the photos. The software algorithms try to compensate for them by retrieving the missing information from the overlapping photos (complementary to each other) and filling the gaps with artefacts (arbitrary surfaces) obtained by joining the points of the mesh (through the triangulation process). High-resolution and well-aligned photos, therefore, require only a few accurate compensations (small artefacts necessary to complete the surface of the object), ensuring high precision of the model and consequently a realistic rendering of the surface. Metashape then automatically fills in the mesh, ensuring the scale accuracy under 1 mm previously argued.
Although equipment costs are variable, all the necessary tools are generally inexpensive and readily available. Moreover, the low weight (camera 1.4 kg, tripod 1.2 kg, and PhotoBox, turntable, and other tools around 1.5 kg in total) and small size of the tools ease the operational constraints.
The aim of this research is focused only on the reconstruction of human skulls. However, as the photograph series forms circles of camera positions around the specimen, this method can be applied to other skeletal elements and other objects by slightly varying the number of circles and the rotation intervals around the target according to their size. In this context, we found that, as a general rule, smaller and/or thinner objects require fewer circles but smaller rotation intervals and vice versa (e.g. maxilla, vertebrae, coxae, and sacrum can be reconstructed with two circles and photos taken at 10° rotation intervals). However, the overall applicability of this method is restricted by the previously described limitations of close-range photogrammetry. Therefore, for skeletal elements and objects under 6 cm (such as teeth), we suggest using stereophotogrammetry techniques (Piazzesi 1973; Santella and Milner 2017), or aerial photogrammetry for wider contexts (Nikolakopoulos et al. 2017; Fabris and Pesci 2005). The flexibility of the protocol makes it potentially useful in a wide range of areas such as the preservation and the restoration of skeletal and archaeological finds (Alliez et al. 2017; Astorqui 1999; Mallison and Wings 2014; Pescarin et al. 2013).

Conclusion
Our system showed that close-range photogrammetry based on SfM could be used to produce high-quality models. The fine-tuned protocol proposed is capable of building models suitable for several virtual anthropology research fields.
The key novelties of what is presented here (Photography settings and Photogrammetry workflow) are the smaller number of photos necessary for total coverage of the skull and the alignment of the chunks using four markers drawn directly on the specimen.
The advantages of the methods proposed consist of (1) an easily applicable protocol by users with different backgrounds; (2) equipment costs, availability, ease of storage, and transportability; and (3) the high-fidelity 3D digital models which are generated in approximately 1 h.
Realistic and millimetric models (high-resolution models) were produced using a scientific method based on modern techniques that with 'minimal effort' ensure:
• A photo acquisition protocol and a photogrammetric workflow that are easily used by a wide range of users.
• Complete skull coverage with only two sets of 54 photos (108 in total).
• Final model accuracy suitable for conducting several osteometric analyses, obtained by manipulating only a few of Agisoft-Metashape's main settings.
• Inexpensive, readily available, and transportable (low weight and small size) tools that ease the operational constraints both inside and outside of a laboratory.
Future developments using this method modulated on other skeletal elements and other objects will provide great benefits that can be applied to a wide framework of issues such as the preservation and restoration of skeletal and archaeological finds.