Low-cost sensors for rapid mapping of cultural heritage: first tests using a COTS Steadicamera

One of the most recent and pressing requests to researchers working in the field of Geomatics is to investigate, validate, and propose new strategies for the rapid mapping of different contexts with low-cost solutions. The continuous development of image-matching algorithms and their use in structure from motion (SfM) software allow new sensors and new strategies to be used for the production of 3D models with an image-based approach. In recent years, another central issue for researchers has been the documentation of cultural heritage (CH) artifacts using different sensors and techniques. The experience presented in this paper focuses on these two aspects: a commercial off-the-shelf (COTS) steadicam was tested for the rapid 3D documentation of two cultural heritage artifacts. The two sites chosen to evaluate this mass-market sensor were the Basilica of San Nicola (Tolentino, MC, Marche Region, Central Italy) and the Castello del Valentino (Torino, TO, Piedmont Region, North Italy). The metric products obtained with the steadicam were compared with those of more consolidated techniques such as close-range photogrammetry (CRP) and terrestrial laser scanning (TLS). The products derived from the different techniques were then evaluated, and an overall assessment of this new solution was made.


Introduction
In recent years, a great number of works in the field of Geomatics have focused on the use of low-cost and commercial sensors for rapid mapping in different scenarios (Kolev et al. 2014; Micheletti et al. 2014; Nocerino et al. 2017). In Italy, due to the recent seismic events, attention has also been focused on post-disaster scenarios, where the contribution of Geomatics can become imperative. Researchers are nowadays experimenting with different solutions based on several techniques. Simultaneous localization and mapping (SLAM), portable LiDAR, and photogrammetric approaches seem to be the most promising for future developments (Brocchini et al. 2017; Chiabrando et al. 2017b, c; Micheletti et al. 2014). The key concepts of the range/image-based sensors are rapidity of use and low cost.
The SLAM technique is based on an algorithm that, using the information derived from a sensor (often LiDAR or imagery), is able to compute the position of the device in space. Thanks to this technology, new and interesting tools for rapid mapping are available on the market, although their costs do not yet allow them to be defined as accessible to the mass market. Some experiences (Bosse et al. 2012; Brocchini et al. 2017; Chiabrando et al. 2017b, c) have investigated the 3D documentation of different scenarios using SLAM technologies and compared the results with other consolidated methodologies.
In the scenario of image-based techniques, a great acceleration of the whole process has been facilitated by two factors: the introduction on the market of new low-cost and high-performance sensors (smartphones, tablets, and increasingly capable integrated cameras) and the continuous development of the image-matching algorithms that are embedded in software using an SfM approach (Harwin and Lucieer 2012; Koutsoudis et al. 2014; Remondino et al. 2014; Westoby et al. 2012).
The use of steadicams is now widespread, mainly for video and photo documentation, for both professional and non-professional purposes. The research experience on this subject is not exhaustive, but multi-sensor devices with stabilization have been developed for applications in environmental contexts, such as agriculture (Das et al. 2015).
The possibility of obtaining different metric products with an SfM approach based on video frame extraction was tested in this research, using a new COTS steadicam. More consolidated sensors, a terrestrial laser scanner (TLS) and digital single-lens reflex (DSLR) cameras, were used as ground truth.

Image-based solutions for cultural heritage documentation: low-cost sensors and rapid mapping
As already stated, many researchers have focused on the use of mass-market sensors combined with an image-based approach. The use of low-cost sensors such as smartphones, tablets, and action cams for rapid mapping is becoming more and more common, and obtaining metric data is becoming extremely fast. Since the acquisition time is very short and the technologies are low cost, these strategies are applicable to different contexts, including areas at risk.
In this sense, the active and central role of Geomatics in the documentation of cultural heritage has been established for many years. The principles that drive documentation work are well known to the experts of the field: the documentation has to be continuous, it should accompany every stage of the life of the heritage, and it has to be planned with future applications in mind. Such principles are contained in the ICOMOS charters, particularly in the Sofia Charter, 1996, and re-affirmed by the European Landscape Convention of the Council of Europe, 2000; with the CIPA 3 × 3 rules in their last update of 2013, the will to optimize the whole process of mandatory documentation of CH has been reaffirmed (Waldhäusl et al. 2013). Moreover, thanks to the technological improvements described above, it is nowadays possible to reach good levels of sustainability, in terms of quality of information and detail as well as time and cost. Furthermore, when working on cultural contexts, it is also necessary to consider the importance of their spatial and temporal complexity. For all these reasons, several researchers in Geomatics have recently begun to approach these themes with a new perspective, trying to improve and adapt their methods to the different needs of the sites (Balletti et al. 2015; Gruen 2009; Lerma et al. 2011). As is well known, the choice of specific approaches with dedicated sensors leads to specific outputs in terms of the level of detail requested and the resources involved. Technologies dedicated to the rapid mapping of cultural heritage have been studied by several researchers (D'Annibale et al. 2013; Fiorillo et al. 2016; Kwiatek and Tokarczyk 2015; Pérez Ramos and Robleda Prieto 2016), while, as previously reported, the use of a steadicam for these purposes is still a poorly researched topic.

Sensors
As already stated, different sensors, depending on the techniques adopted, were employed: a TLS, two different DSLR cameras, and a steadicam.

Steadicam
The DJI OSMO+ steadicam (Dà-Jiāng Innovations Science and Technology Co. 2017) was tested in order to evaluate the metric and radiometric quality of the 3D photogrammetric model. A detailed description of the steadicam is reported below, and its specifications are given in Table 1. The OSMO+ (Fig. 1) is a handheld camcorder manufactured by DJI and released in October 2015. Equipped with the X3/FC350H camera, the OSMO+ is capable of recording 4K videos (max 30 frames/s) in MP4 or MOV and capturing photos at 12 Mpx (max 4096 × 2160 px) in Adobe DNG RAW or JPEG formats. It is also possible to upgrade the sensor with other cameras. The sensor is mounted on a 2-axis gimbal that stabilizes the device (an additional gimbal can be mounted for the Z-axis), cushioning the movements of the operator's arm and providing smooth videos without shaky frames. On the left of the handle, an adjustable support can house a smartphone which, through a dedicated app for iOS or Android, provides the control system of the device as well as the screen showing what the camera is framing.

DSLR camera
Furthermore, a more traditional close-range photogrammetric approach based on digital cameras was performed. Two different DSLR cameras were used for the surveys; their specifications are shown in Table 2.

Laser scanner
A TLS acquisition was performed using a time-of-flight (ToF) laser scanner Faro Focus 3D by CAM2 (Table 3), in order to obtain a ground-truth metric model.

The case studies
In this first phase, the multi-sensor approach, aimed at the evaluation of the photogrammetric products obtained with the steadicam, was tested on two cultural heritage artifacts, specifically chosen in order to compare similarities and differences: the Castello del Valentino (Torino, TO, North Italy) and the Basilica of San Nicola (Tolentino, MC, Central Italy). Both sites chosen as case studies possess a high cultural value and peculiar features. Furthermore, both can be included, for different reasons, in that part of cultural heritage subjected to several types of risk.

The Valentino castle cabinet (A)
The Castello del Valentino (case study A) in Fig. 2 is part of the Royal Residences of the House of Savoy in the territory of Piedmont. The castle nowadays hosts the Faculty of Architecture of the Politecnico di Torino, several conferences, and different public events organized for the local community; these factors, together with the inevitable decay of historical structures, place the castle in the list of sites in a condition of middle risk (Fig. 3). Moreover, the castle is located near the Po River (the longest Italian river), in an area subject to potential floods. Since 1977, the Castello del Valentino has also been part of the UNESCO list of World Heritage Sites. The castle was built in the sixteenth century but reached its current configuration only one century later, when important works of restoration and expansion were commissioned to Carlo and Amedeo di Castellamonte by Maria Cristina di Borbone. The castle was partially restored in the second half of the nineteenth century. Over the centuries, these important buildings were assigned to different functions (residence for noble families, veterinary school, and barracks) until 1906, when they reached their current use.
Among the different indoor rooms that compose the castle, the Gabinetto dei Fiori Indorato was chosen as the object of the different acquisitions due to its limited dimensions and its ease of access. The cabinet of the Castello del Valentino, with its pavilion vault (Fig. 2), is characterized by an abundance of stucco-work and an absence of frescos compared with the other rooms of the royal apartment. The stucco-work is composed of interlacing plants, branches, and acanthus scrolls, enriched by a bloom of flowers and roses gently protruding from the ceiling. It was realized by Pompeo and Francesco Bianchi in 1642, probably with the collaboration of Carlo Solaro and Giovanni Casasopra.

The San Nicola Church (B)
The building history of the Basilica di San Nicola in Fig. 4 (case study B) is tied to the Italian saint "San Nicola", who lived in Tolentino in the second half of the thirteenth century A.D. and was buried in this ecclesiastic complex. The structure of the basilica was mainly modified between the fourteenth and seventeenth centuries A.D., and it is nowadays one of the most important religious and cultural complexes of Central Italy. The whole city of Tolentino and the basilica were slightly damaged by the seismic events that hit this area of Italy starting from 24 August 2016, and the religious complex is still closed to the public for security reasons, damage assessment, and restoration (Fig. 5).
The coffered ceiling (Fig. 4), which is the area of the religious complex chosen to be surveyed, was commissioned by the bishop Giambattista Visconti, and its realization ended in 1628 A.D. after 23 years of work. It is composed of 21 coffers, divided into 7 rows, and hosts the wooden statues of different saints and characters of the gospels (Mariano 2008).

Data acquisition and processing
As presented in the previous paragraphs, the two case studies share similar characteristics while having their own peculiar architectural attributes. The main nave of the San Nicola church measures 13.5 × 38.5 m (almost 520 m²); the cabinet measures 5 × 4 m (almost 20 m²). The main difference between these two elements is related to the distance of the ceiling from the floor. The examination focuses on the ceilings of the two indoor environments: both are richly decorated surfaces, although with different materials and decoration techniques. The cabinet is decorated with painted stuccos (gold and blue), while the coffered ceiling of San Nicola is made of lacquered wood. Obviously, the two materials react differently to light, and this factor had to be considered during the acquisitions. More specifically, San Nicola was poorly lit by natural light (so the illumination system of the church had to be used), while the cabinet did not require artificial light, thanks to the abundant natural light. All the factors cited above were considered during the acquisition phase, and the geometry of the different acquisitions was planned accordingly. All the acquisitions were performed without the use of elevated structures such as scaffolding. The datasets derived from the three sensors were processed using standard workflows.
A synthesis of the time factor in the process of data acquisition and processing is shown in Table 4.
First of all, to make the description more schematic, the laser scanning acquisition is presented for both objects; a more in-depth section is then dedicated to the two photogrammetric approaches that were tested on the two case studies.
(L) Laser scanning
The TLS acquisitions were planned to reach a good coverage (in terms of quantity and quality of information) for both the objects surveyed. To cover the main nave of the San Nicola church, 30 scans were acquired (Fig. 6, red circles); 5 scans were acquired for the cabinet. The TLS scans were registered using the software Faro SCENE through a consolidated process: first, the scans were registered using a cloud-to-cloud approach, and in a second step, they were georeferenced through a set of acquired control points (Tables 5 and 6).
(S) Steadicam photogrammetry
The OSMO acquisitions were performed by blocking the Z-axis of the gimbal, with the camera axis pointing at the ceiling (nadir-like configuration) and recording a 4K video at 24 fps (frames per second). Blocking the axis of the gimbal made it possible to stabilize the camera and perform the video acquisition while walking under the ceiling. No oblique videos/images were acquired with the OSMO, because the aim was to analyze the results of the most rapid and standard acquisition (nadir) in a grid strip configuration. From the video, a set of frames was extracted to perform the SfM photogrammetric workflow. For test (A), the cabinet, frames were extracted at a rate of 2 frames/s, while for case (B), frames were extracted at a rate of 1 frame/s.
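The frame extraction step can be illustrated with a minimal sketch. It only computes which frame indices of a 24 fps video would be kept at a given extraction rate; the function name and the clip length are hypothetical, and the actual tool used to decode the video in this work is not specified.

```python
def frame_indices(video_fps, frames_per_second, total_frames):
    """Return the indices of the frames kept when a video recorded at
    `video_fps` is sampled at `frames_per_second` (hypothetical helper)."""
    if video_fps <= 0 or frames_per_second <= 0:
        raise ValueError("frame rates must be positive")
    # Number of source frames between two consecutive extracted frames.
    step = video_fps / frames_per_second
    indices = []
    k = 0
    while True:
        idx = round(k * step)
        if idx >= total_frames:
            break
        indices.append(idx)
        k += 1
    return indices


# Assumed example: a 60 s clip recorded at 24 fps contains 1440 frames.
cabinet_frames = frame_indices(24, 2, 1440)   # case (A): 2 frames/s
basilica_frames = frame_indices(24, 1, 1440)  # case (B): 1 frame/s
print(len(cabinet_frames), len(basilica_frames))
```

At 24 fps, an extraction rate of 2 frames/s keeps one frame out of 12, and 1 frame/s keeps one out of 24; a lower rate reduces the overlap between consecutive frames for the same walking speed.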
(C) Digital camera photogrammetry
The traditional photogrammetric acquisitions were carried out using a photographic tripod to minimize vibrations and consequently reduce image noise. The use of the tripod was particularly important in the case of San Nicola because of the bad lighting conditions. For both objects, several strips of images were acquired, both with the lens axis pointing straight at the ceiling and with an inclination of ≈ 45° (Chiabrando et al. 2017a, d), to obtain a more complete reconstruction of the object. The photos were shot in both RAW and JPG format in order to perform a radiometric correction before the processing.
The data acquisition of the two photogrammetric approaches (S) and (C) is compared below in parallel for each case study (A) and (B) (Figs. 7 and 8), together with the processing results and the metric control of the 3D models, performed through total station measurements on markers and natural points used as ground control points (GCPs) and check points (CPs).

The Valentino castle cabinet (A)
The San Nicola Church (B)
Summarizing the results of the metric control performed on the two photogrammetric models, organized in Tables 7 and 8, the metric resolution and the RMS error on GCPs are reported in Tables 9 and 10. The resolution of the photogrammetric model obtained with the steadicam, in Table 9, is strictly related to the geometry of the room and thus to the acquisition geometry. The model of the cabinet returns a resolution of 1 mm/px (about twice the value of the reflex camera model, 0.5 mm/px), whereas the ground sample distance (GSD) for the coffered ceiling of the basilica is 4.2 mm/px (roughly 1.3 times the value of the reflex camera model, 3.3 mm/px), due to both the greater shooting distance and the different sensor features.
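The dependence of the GSD on shooting distance and sensor features can be made explicit with the standard pinhole relation GSD = pixel size × distance / focal length. The sketch below is illustrative only: the pixel size, focal length, and distance in the first example are assumed values, not the specifications of the sensors used here; only the four GSD figures are taken from the text.

```python
def ground_sample_distance(pixel_size_mm, focal_length_mm, distance_mm):
    """GSD in mm/px on a surface orthogonal to the camera axis
    (standard pinhole-camera relation)."""
    return pixel_size_mm * distance_mm / focal_length_mm


def resolution_ratio(gsd_a, gsd_b):
    """How many times coarser model A is with respect to model B."""
    return gsd_a / gsd_b


# Assumed sensor values, for illustration only.
example_gsd = ground_sample_distance(pixel_size_mm=0.002,
                                     focal_length_mm=4.0,
                                     distance_mm=2000.0)

# GSD values reported in the text (mm/px): steadicam vs reflex camera.
cabinet_ratio = resolution_ratio(1.0, 0.5)
basilica_ratio = resolution_ratio(4.2, 3.3)
print(round(example_gsd, 3), cabinet_ratio, round(basilica_ratio, 2))
```

For a fixed sensor, the GSD grows linearly with the shooting distance, which is consistent with the coarser resolution obtained on the higher ceiling of the basilica.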
The RMS errors on CPs (Tables 7 and 8) and the synthesis on GCPs (Table 10) indicate a good performance of the photogrammetric approach based on the steadicam: 3.2 mm for the cabinet (1:8 ratio with the camera model control) and almost 1 cm for the coffered ceiling of the Basilica (1:2 ratio with the camera). All of this, however, has to be related to a suitable architectonic representation scale: up to 1:20, as expected, for the digital camera model and almost 1:50 for the steadicam model.
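A minimal sketch of how an RMS error on control or check points can be computed and related to a representation scale is given below. The residuals are hypothetical, and the link to the scale relies on an assumption that is not stated in the text: the common convention of a 0.2 mm plotting error at the drawing scale (so that the tolerance is 4 mm at 1:20 and 10 mm at 1:50).

```python
import math


def rmse_3d(residuals_mm):
    """RMS error from a list of (dx, dy, dz) residuals in mm between
    model coordinates and total station coordinates."""
    if not residuals_mm:
        raise ValueError("at least one residual is required")
    squared = sum(dx * dx + dy * dy + dz * dz for dx, dy, dz in residuals_mm)
    return math.sqrt(squared / len(residuals_mm))


def scale_tolerance_mm(scale_denominator, plotting_error_mm=0.2):
    """Tolerance on the object at scale 1:scale_denominator.
    The 0.2 mm plotting error is an assumed convention."""
    return scale_denominator * plotting_error_mm


def finest_supported_scale(rmse_mm, candidates=(20, 50, 100, 200)):
    """Smallest scale denominator whose tolerance covers the RMS error."""
    for denominator in sorted(candidates):
        if rmse_mm <= scale_tolerance_mm(denominator):
            return denominator
    return None


# Hypothetical residuals on three check points (mm).
print(round(rmse_3d([(1.0, -2.0, 2.0), (0.5, 1.5, -1.0), (-2.0, 0.0, 1.0)]), 2))
# An RMS error of about 1 cm, as reported for the coffered ceiling.
print(finest_supported_scale(10.0))
```

Under this assumed convention, an RMS error close to 1 cm is compatible with a 1:50 representation scale.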

Multi-scale evaluation of the 3D models: geometric and radiometric comparison
A set of focused analyses is presented below with the aim of organizing a multi-level comparison and weighing of the sensors, especially for the two photogrammetric approaches: the more traditional one with a DSLR camera and the experimental one with video frames from the Osmo+ steadicam, both compared with the laser scanner point cloud model. Two levels of investigation are conducted: the first on a sample area of the two ceilings and the second on a small detail of the decoration.

Orthoimage and digital model
A rigorous comparison of the surfaces from a geometric and radiometric point of view is carried out first, in order to test the effective capability of achieving a highly detailed documentation of complex architectonic surfaces. The two sensors are called upon to depict thin decorative details and, in the case of risk contexts and damaging events, to support decay mapping and to represent and measure potential cracks. The orthoimages and the 3D models (raster digital elevation model and triangulated model), generated using the commercial software PhotoScan by Agisoft, are considered below.
Looking again at the GSD of the two models, Table 9 clearly shows the different resolution ratio, which depends on both the shooting distance and the sensor size. Nevertheless, Fig. 9 confirms the good level of detail reached by the steadicam models compared with the reflex camera ones, which is in any case very satisfactory for mapping and measuring a large amount of information on the surface concerned. A specific assessment of the triangulated mesh and of the DEM is reported below. Fig. 10 shows transversal sections, extracted from the high-definition triangulated surfaces using the software 3DReshaper. The three profiles corresponding to the three sensors are highlighted.
According to the DEM resolution for the Z value, the assessment was carried out using the software QGIS on a set of control points considered statistically, both all together and clustered according to their spatial distribution over the three-dimensional surfaces. Some considerations are needed about these results, which show different performances according to the criticality of the surface on which the points were picked. The main problem on the cabinet vault in the Valentino Castle (Fig. 11 and Table 11) is the nadir shooting configuration, with the projecting rays of the cameras very oblique with respect to the base curvature of the pavilion vault (points A) and more orthogonal with respect to the central area of the vault (points B). The wooden coffered ceiling in Tolentino (Fig. 12 and Table 12) has a strong three-dimensionality (almost 1 m depth); the points B are the ones distributed on the recesses of the paneled ceiling with the decorative apparatus of saints. The points C, in turn, are distributed on the cruciform decorative wooden profiles of the ceiling with composite engravings.
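The statistical treatment of the control points, all together and clustered by position, can be sketched as follows. This is not the QGIS procedure used in the study: it is a plain-Python illustration under the assumption that each control point carries a cluster label and two Z values (tested DEM and reference DEM); the sample values are hypothetical.

```python
import math
from collections import defaultdict


def z_discrepancy_stats(samples):
    """Per-cluster and overall statistics of Z differences.

    `samples` is an iterable of (cluster_label, z_test, z_reference),
    with Z values in metres. Returns {label: {n, mean, std, rmse}},
    plus an "all" entry covering every point."""
    groups = defaultdict(list)
    for label, z_test, z_reference in samples:
        diff = z_test - z_reference
        groups[label].append(diff)
        groups["all"].append(diff)

    stats = {}
    for label, diffs in groups.items():
        n = len(diffs)
        mean = sum(diffs) / n
        variance = sum((d - mean) ** 2 for d in diffs) / n  # population variance
        stats[label] = {
            "n": n,
            "mean": mean,
            "std": math.sqrt(variance),
            "rmse": math.sqrt(sum(d * d for d in diffs) / n),
        }
    return stats


# Hypothetical control points: cluster A (vault base), cluster B (vault centre).
points = [("A", 4.512, 4.500), ("A", 4.493, 4.500),
          ("B", 5.101, 5.100), ("B", 5.098, 5.100)]
for label, values in z_discrepancy_stats(points).items():
    print(label, values["n"], round(values["mean"], 4), round(values["std"], 4))
```

Keeping the clusters separate makes it visible when the discrepancies concentrate on a specific part of the surface, such as the oblique areas at the base of the vault.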

A sample area
A test area was selected in both case studies, corresponding to a transversal slice of the ceiling: a module made up of three caissons for the Basilica, and a central portion of the pavilion vault for the cabinet. In this step of the analytical comparison, the point clouds originating from the three sensors were considered, taking into account several of their geometric features: number of points (in the entire sample and in a 10-cm slice, in order to normalize the comparison), density, and roughness. Moreover, a cloud-to-cloud absolute distance computation was performed using the software CloudCompare, comparing the Osmo+ steadicam 3D model with the reflex camera cloud and with the LiDAR cloud.
The pavilion vault of the cabinet
Setting up the documentation project in such a small environment with good lighting conditions led, as expected, to excellent results in terms of metric and radiometric resolution for the more traditional sensors, as presented in the previous paragraph. However, the good resolution and accuracy of the Osmo+ point cloud discussed above are confirmed and extended here: the cloud can be defined as very competitive in terms of geometry, density, and roughness, despite the higher detail of the LiDAR one (Table 13). In the comparison between sensors, the standard deviations between models are within one centimeter (Table 14, Fig. 13).
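The idea behind a cloud-to-cloud absolute distance computation can be sketched as a nearest-neighbour search: for each point of the compared cloud, the distance to the closest point of the reference cloud is taken. The code below is a brute-force illustration of that principle, not the CloudCompare implementation (which relies on spatial indexing and offers local surface models); the coordinates are hypothetical.

```python
import math


def c2c_distances(compared, reference):
    """Nearest-neighbour distance from each point of `compared` to the
    `reference` cloud. Points are (x, y, z) tuples in metres.
    Brute force, O(n * m): suitable only for small samples."""
    return [min(math.dist(p, q) for q in reference) for p in compared]


def summarize(distances):
    """Mean, standard deviation, minimum, and maximum of the distances."""
    n = len(distances)
    mean = sum(distances) / n
    std = math.sqrt(sum((d - mean) ** 2 for d in distances) / n)
    return {"mean": mean, "std": std, "min": min(distances), "max": max(distances)}


# Hypothetical samples of a steadicam cloud and a LiDAR reference cloud.
steadicam = [(0.00, 0.00, 3.002), (0.10, 0.00, 3.001), (0.20, 0.00, 2.998)]
lidar = [(0.00, 0.00, 3.000), (0.10, 0.00, 3.000), (0.20, 0.00, 3.000)]
print(summarize(c2c_distances(steadicam, lidar)))
```

The four summary values correspond to the kind of statistics reported for the distance computations (mean, standard deviation, minimum, maximum).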

The coffered ceiling modules
With a large change of scale, as in the Tolentino Basilica, where the space is larger by a ratio of about 30:1 and the ceiling distance is tripled, the metric accuracy is lower, as reported before, and so is the density of information on the model, even though it still corresponds to the architectonic scale, especially as regards the Osmo+ model. The surface definition is penalized by the shooting distance and by the strong three-dimensionality of the undercuts of the caissons, and it suffers from the lighting conditions (Tables 15 and 16).
The comparison between models in the same test area along the bas-reliefs of the caissons shows that the Osmo+ photogrammetric point cloud is, on average, well detailed, although it falls short of the more consolidated sensors (1:6 ratio with LiDAR and 1:4 ratio with the reflex camera) while being definitely less affected by roughness (Table 16). However, the discrepancy in terms of absolute distances is not uniformly distributed across the whole area. In the comparison with the LiDAR and DSLR camera clouds, the densification loses precision at the edges of the ceiling, where it meets the walls (Fig. 14).
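The per-cloud figures used in this comparison (number of points in a 10-cm slice, density within a sphere of radius 0.01 m) can be sketched as follows. The definition of density as points per sphere volume is an assumption based on the wording of the table header, and the brute-force neighbour search is only meant for small samples; the points are hypothetical.

```python
import math


def points_in_slice(points, axis, start, thickness=0.10):
    """Points whose coordinate along `axis` (0=x, 1=y, 2=z) falls in
    [start, start + thickness). Default thickness: 10 cm."""
    return [p for p in points if start <= p[axis] < start + thickness]


def volume_density(points, radius=0.01):
    """For each point, number of points within `radius` (the point itself
    included) divided by the sphere volume, in pt/m^3. Assumed definition."""
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    densities = []
    for p in points:
        neighbours = sum(1 for q in points if math.dist(p, q) <= radius)
        densities.append(neighbours / volume)
    return densities


# Hypothetical cloud sample (metres).
cloud = [(0.000, 0.0, 0.0), (0.005, 0.0, 0.0), (0.008, 0.0, 0.0), (0.500, 0.0, 0.0)]
slice_points = points_in_slice(cloud, axis=0, start=0.0)
mean_density = sum(volume_density(cloud)) / len(cloud)
print(len(slice_points), round(mean_density))
```

Counting the points in a slice of fixed thickness gives a figure that does not depend on the extent of each sample, which is what makes clouds from different sensors comparable.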

A focus on a decorative detail
Two details of comparable area, about 1 m², are compared in order to examine the level of detail that the Osmo+ can reach when the documentation need is the expected rapid mapping of damage and cracks, where metric and radiometric information are equally significant (Figs. 15 and 16 and Tables 17 and 18).

The stucco-work floral ornament
The crowned snake

Conclusion
The aim of the research is to evaluate the Osmo+ steadicam as a low-cost sensor for metric applications and rapid 3D mapping with photogrammetric methods. Therefore, a final assessment is carried out at the end of the analytical workflow presented in this contribution. The characteristic features, together with the strengths and weaknesses, of the photogrammetric technique based on the Osmo+ steadicam are summarized in a tabular overview (Table 19).
The rapidity of image/video acquisition in 4K and the functionality and user-friendliness of the Osmo+, a lightweight and handy device, are the most significant aspects that define the steadicam as an effective solution for the rapid mapping of critical environments, regardless of the survey scale. Having one hand free for side tasks while walking is not a secondary attribute. The shooting distance and the lighting conditions, however, are important factors that significantly influence the quality of the 3D models, especially in large, complex spaces or in outdoor applications, because of overexposure. In addition, the reduced angle of view, the short battery life, and the gimbal stability, which often needs recalibration, are crucial aspects that the operator has to take into consideration for successful operational work. Nevertheless, the price is very competitive (600-1000$ for the base configuration or accessories, up to 2000-3000$ for the extreme stabilization with the full accessories pack).
As far as functionality in acquisition and data processing is concerned, the detail reached is very competitive compared with the DSLR camera, as is the metric accuracy. Despite these aspects, the context of use has to be considered. In the case of emergency or middle risk, the time-cost ratio and the metric accuracy are important, together with the operator's occupancy of the spaces. The metric and radiometric contribution to the 3D documentation of the local conditions is crucial, and the use of a steadicam proved suitable for the expectations and more than acceptable in the final evaluation.

Fig. 1 DJI OSMO+ with a mounted smartphone (left) and detail of the handle controls

Fig. 6 Case studies A (top, the cabinet) and B (bottom, the church). In red, the positions of the scans; in green, the positions of the transversal sections shown in Fig. 10

Fig. 8 Zoomed images of the ceiling in San Nicola church: (left) reflex camera; (right) steadicam frame

Fig. 9 Zoomed images on a decorative detail of the two ceilings (A, top), and (B, right column), comparing digital camera and steadicam, with the display of the textured and not textured 3D model

Fig. 11 Test case (A), orthoimage and DEM on which a set of control points is distributed

Fig. 13 Test case (A): cloud-to-cloud distance computation and histograms plotting the Gaussian of absolute distances (meters) between point clouds. (top) steadicam VS DSLR and histogram on left; (bottom) steadicam VS LiDAR and histogram on right

Fig. 14 Test case (B): cloud-to-cloud (C2C) distance computation and histograms plotting the Gaussian of computed absolute distances (meters) between point clouds. (top) steadicam VS DSLR and histogram on left; (bottom) steadicam VS LiDAR and histogram on right

Fig. 15 Test case (A) detail: roughness index maps for the three sensors. From top: LiDAR, reflex camera, steadicam. A higher roughness index correlates with a higher level of detail of the 3D surface, corresponding to the most raised areas of the point cloud; the Osmo+ cloud is the least rough and undoubtedly the most smoothed

Fig. 16 Test case (B) detail: roughness index. From top: LiDAR, reflex camera, Osmo+. The photogrammetric densification suffers more from bad lighting conditions than the LiDAR one, despite the dense roughness index

Table 1
Main specs of the steadicam

Table 2
Specifications of the two cameras

Table 3
Specifications of the scanner

Fig. 3 Risks of ordinary type related to the everyday usage of the Valentino Castle for academic activities of Politecnico di Torino and numerous events that take place in the Castle. Source: www.lastampa.it and www.polito.it

Table 4
Specifications about data acquisition and processing time

Table 5
Scan specifications

Table 7
Test (A), the Valentino castle cabinet: (S) Osmo+ steadicam and (C) Canon camera datasets, processing and metric control with RMSE (root mean square error).

Table 8
Test (B), the coffered ceiling in San Nicola church: (S) Osmo+ steadicam and (C) Canon camera datasets, processing and metric control with RMSE (root mean square error).

Table 9
Synthesis of GSD [mm/px] of the 3D models for both case studies using both photogrammetric approaches

Table 10
Synthesis from Table x and x of RMS error on GCPs [mm] for both case studies using both photogrammetric approaches

Table 12
Statistical values on control points, combined according to their distribution on the coffered ceiling

Table 13
Analysis on the sample area of the (A) case study according to the three sensors

Table 14
Statistical synthesis of the cloud-to-cloud absolute distance computation on (A) (values in meters)
Comparison                     Mean     St. dev.   Min      Max
Steadicam VS digital camera    0.0028   0.0021     0.0000   0.0932
Steadicam VS LiDAR             0.0022   0.0018     0.0000   0.0917

Table 15
Analysis on the sample area of the (B) case study according to the three sensors. Density (pt/V. sphere, r = 0.01)

Table 16
Statistical synthesis of the cloud-to-cloud absolute distance computation on (B)

Table 18
Roughness index on detail in case (B)

Table 19
Synthetic overview of many factors involved in a global evaluation of operational efficiency in use and metric accuracy of results about the tested COTS steadicam together with the other tested sensor