A Structure-from-Motion Pipeline for Topographic Reconstructions Using Unmanned Aerial Vehicles and Open Source Software
In recent years, accurate topographic reconstructions have found applications ranging from the geomorphic sciences to remote sensing and urban planning, among others. The production of high-resolution, high-quality digital elevation models (DEMs) requires a significant investment in personnel time, hardware, and software. Photogrammetry offers clear advantages over other methods of collecting geomatic information: airborne cameras can cover large areas more quickly than ground survey techniques, and photogrammetry-based DEMs often have higher resolution than models produced with other remote sensing methods such as LIDAR (Laser Imaging Detection and Ranging) or RADAR (Radio Detection and Ranging).
In this work, we introduce a Structure from Motion (SfM) pipeline that uses Unmanned Aerial Vehicles (UAVs) to generate DEMs for topographic reconstruction and for assessing the microtopography of a terrain. SfM is a computer vision technique that estimates the 3D coordinates of many points in a scene from two or more 2D images acquired from different positions. By identifying common points across the images, both the camera positions (motion) and the 3D locations of the points (structure) are obtained. The output of the SfM stage is a sparse point cloud in a local XYZ coordinate system. We edit the obtained point cloud in MeshLab to remove unwanted points, such as those from vehicles, roofs, and vegetation, and we scale it using Ground Control Points (GCPs) and GPS information, which enables georeferenced metric measurements. For experimental verification, we reconstructed a terrain suitable for subsequent analysis in GIS software. Encouraging results show that our approach is highly cost-effective, providing a means of generating high-quality, low-cost DEMs.
Keywords: Geomatics · Structure from Motion · Open source software
A digital elevation model (DEM) is a three-dimensional representation of the topography of a terrestrial zone and a commonly used geomatics tool for analyzing land properties such as slope, height, and curvature. Several technologies exist for generating DEMs, including LIDAR (Laser Imaging Detection and Ranging), RADAR (Radio Detection and Ranging), and conventional theodolites. However, these techniques often do not offer enough spatial resolution to recover the terrain microtopography. On the one hand, it is frequently difficult to measure intricate drainage networks accurately with conventional techniques because they are below the measurement resolution. On the other, they are also difficult to measure in the field due to access limitations. As an alternative to these methods, Unmanned Aerial Vehicles (UAVs) equipped with high-resolution cameras have recently attracted the attention of researchers. A UAV acquires many images of an area of interest, and stereo-photogrammetry techniques turn them into a terrain point cloud that represents an accurate DEM of the terrestrial zone. However, UAV operation for precise digital terrain model estimation imposes requirements on flight parameters, captured image characteristics, and other aspects.
Stereo-photogrammetry techniques estimate the 3D coordinates of many points in a scene using two or more 2D images taken from different positions. Within these images, common points are identified, that is, the same physical point as seen in different images. A line of sight, or ray, is then constructed from each camera location to the detected object point. Finally, the intersection of these rays is computed, a process known as triangulation, which yields the three-dimensional location of the physical point. By repeating this for a significant number of points in the scene, a point cloud representative of the object or surface is obtained in three-dimensional space.
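The ray intersection described above can be sketched in a few lines. Since two measured rays rarely intersect exactly, a common choice is the midpoint of the shortest segment between them; this is an illustrative implementation of the triangulation idea, not the exact solver used inside the libraries discussed below:

```python
def triangulate(o1, d1, o2, d2):
    """Triangulate a 3D point from two rays, each given by a camera
    origin o and a viewing direction d. Returns the midpoint of the
    shortest segment between the two rays."""
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    w = [a - b for a, b in zip(o1, o2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b          # approaches 0 for parallel rays
    t1 = (b * e - c * d) / denom   # parameter along ray 1
    t2 = (a * e - b * d) / denom   # parameter along ray 2
    p1 = [o + t1 * k for o, k in zip(o1, d1)]
    p2 = [o + t2 * k for o, k in zip(o2, d2)]
    return [(u + v) / 2 for u, v in zip(p1, p2)]
```

For two cameras at (0, 0, 0) and (1, 0, 0) both observing a point at (0.5, 0, 2), the rays intersect exactly and the midpoint recovers that point.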
To obtain a point cloud, that is, to recover structure, correct correspondences between the different images must be established, but incorrect matches often appear, and triangulation fails for them. Therefore, the estimation is usually carried out in a robust fashion. Recently, photogrammetric methodologies have been proposed to address the robust estimation of structure from multiple views, such as Structure from Motion (SfM) and Multi-View Stereo (MVS). On the one hand, SfM, using a single camera that moves through space, recovers both the position and orientation of the camera (motion) and the 3D locations of the points seen in the different views (structure). On the other, MVS densifies the point cloud obtained with SfM.
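The robust-estimation idea can be illustrated with a minimal RANSAC loop: repeatedly hypothesize a model from a random sample of matches and keep the hypothesis with the largest consensus set. Here the model is a simple 2D translation between matched points; this is a didactic sketch, not the epipolar-geometry-based rejection actually performed in SfM pipelines:

```python
import random

def ransac_translation(matches, iters=200, tol=1.0, seed=0):
    """matches: list of ((x1, y1), (x2, y2)) correspondences.
    Hypothesize a translation from one random match, count inliers
    within `tol` pixels, and return the best hypothesis and its
    consensus set."""
    rng = random.Random(seed)
    best_t, best_inliers = None, []
    for _ in range(iters):
        (x1, y1), (x2, y2) = rng.choice(matches)
        tx, ty = x2 - x1, y2 - y1
        inliers = [m for m in matches
                   if abs(m[1][0] - m[0][0] - tx) < tol
                   and abs(m[1][1] - m[0][1] - ty) < tol]
        if len(inliers) > len(best_inliers):
            best_t, best_inliers = (tx, ty), inliers
    return best_t, best_inliers
```

Even with a few grossly wrong matches mixed in, the translation supported by the majority of correspondences wins, and the outliers are excluded before any triangulation.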
Nowadays there are several commercial packages, such as Agisoft or Pix4D, that can produce dense 3D point clouds. However, being closed-source applications, they do not favor research reproducibility, and their code cannot be modified. In this paper, we propose a reconstruction pipeline for geomatics applications based entirely on open source software, mainly the OpenSfM and OpenDroneMap libraries.
The camera calibration stage is an optional step in the proposed methodology; for this reason, we show this block with a dotted line in Fig. 2. We carried out the calibration with the OpenCV library to estimate the intrinsic parameters of the drone camera.
The first stage consists of setting the flight path the drone follows to acquire the images; we used the Altizure application to set the flight strategy. The second stage performs the 3D reconstruction. It is based mainly on the SfM photogrammetric technique, implemented with the OpenSfM library, which produces a scene point cloud. If required, we can edit the obtained point cloud in MeshLab, an open source system for processing and editing 3D triangular meshes and point clouds.
2.1 OpenCV Stage: Camera Calibration
The camera position and orientation, the extrinsic parameters, are computed within the SfM pipeline. Therefore, only the intrinsic parameters have to be known before the reconstruction process. Usually, calibration of aerial cameras is performed in the laboratory. For this work, the camera calibration was carried out with OpenCV by acquiring images of a flat black-and-white chessboard. The intrinsic parameters required by OpenSfM are the focal ratio and the radial distortion parameters \(k_1\) and \(k_2\). The focal ratio is the ratio between the focal length in millimeters and the camera sensor width, also in millimeters.
This calibration step is optional because OpenSfM can instead use the values stored in the EXIF metadata of the images; these parameters can then be refined during the reconstruction process.
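The two quantities OpenSfM needs can be written down directly. The sketch below computes the focal ratio and applies the two-parameter radial distortion model to normalized image coordinates; the 8.8 mm focal length and 13.2 mm sensor width are example values for a typical drone camera, not measurements from our setup:

```python
def focal_ratio(focal_mm, sensor_width_mm):
    """Focal ratio as used by OpenSfM: focal length divided by
    sensor width, both in millimeters."""
    return focal_mm / sensor_width_mm

def distort(x, y, k1, k2):
    """Apply the two-parameter radial distortion model to normalized
    image coordinates (x, y): each coordinate is scaled by
    1 + k1*r^2 + k2*r^4, where r is the distance to the center."""
    r2 = x * x + y * y
    factor = 1 + k1 * r2 + k2 * r2 * r2
    return x * factor, y * factor

# Example values (hypothetical camera): 8.8 mm lens, 13.2 mm sensor.
ratio = focal_ratio(8.8, 13.2)   # ~0.667
```

Points at the image center (r = 0) are unaffected, while points near the border move inward for negative \(k_1\) (barrel distortion), which is the typical case for wide-angle drone lenses.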
2.2 Altizure Stage: Image Acquisition
Three-dimensional reconstruction algorithms require images of the object or scene of interest acquired from different positions, with overlap between the acquired images. Any specific region must be observable in at least three images to be reconstructed.
Usually, image-based surveying with an airborne camera requires a flight mission, which is often planned with dedicated software. In this work, we used Altizure, a free mobile application that allows us to design flight paths on a satellite view based on Google Maps, as shown in Fig. 3. With this application, we can also adjust parameters such as the flight height, the camera angle, and the forward and side overlap percentages between images.
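The effect of these parameters can be made concrete with basic pinhole geometry: the ground footprint of one nadir image scales with flight height over focal length, and the overlap percentages determine the spacing between exposures along a line and between flight lines. This is a simplified flat-terrain sketch, not part of Altizure:

```python
def footprint(height_m, focal_mm, sensor_w_mm, sensor_h_mm):
    """Ground footprint (width, height) in metres of one nadir image,
    from similar triangles: ground size = sensor size * H / f."""
    scale = height_m / focal_mm
    return sensor_w_mm * scale, sensor_h_mm * scale

def spacing(height_m, focal_mm, sensor_w_mm, sensor_h_mm,
            forward_overlap, side_overlap):
    """Distance between consecutive exposures (forward) and between
    flight lines (side), given overlap fractions in [0, 1)."""
    w, h = footprint(height_m, focal_mm, sensor_w_mm, sensor_h_mm)
    return h * (1 - forward_overlap), w * (1 - side_overlap)
```

For instance, at 100 m height with a hypothetical 5 mm lens and a 10 mm x 5 mm sensor, each image covers 200 m x 100 m; with 80% forward and 60% side overlap, exposures are 20 m apart and flight lines 80 m apart.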
2.3 OpenSfM Stage: Pipeline Reconstruction
The next stage is the 3D reconstruction process, which we implemented with the OpenSfM library; this library is based on the SfM and MVS techniques. In Fig. 4 we show a workflow diagram of the 3D reconstruction process. First, the algorithm searches for features in the input images. A feature is an image pattern that stands out from its surrounding area and is likely to be identifiable in other images. The next step is to find point correspondences between the images. Finally, the SfM technique uses the matched points to compute both the camera poses and the 3D structure of the object.
Feature Detection and Matching. Feature detection consists of calculating distinctive points of interest in an image that are readily identifiable in another image of the same scene. The detection process should be repeatable, so that the same features are found in different photographs of the same object. Moreover, the detected features should be distinctive, so that they can be told apart from each other.
The detector used by the OpenSfM library is HAHOG (the combination of the Hessian affine feature point detector and the HOG descriptor), but the AKAZE, SURF, SIFT and ORB detectors are also available. These detectors compute feature descriptors that are invariant to scale and rotation, which enables matching features regardless of orientation or scale. In Fig. 5a we show the detected features for a given image with red marks.
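Matching itself can be illustrated with a brute-force nearest-neighbour search and Lowe's ratio test, which keeps a match only when the nearest descriptor is distinctly closer than the second nearest. In practice OpenSfM uses approximate nearest-neighbour search for speed; this is a minimal sketch with toy low-dimensional descriptors:

```python
def match_descriptors(desc1, desc2, ratio=0.8):
    """Match each descriptor in desc1 to its nearest neighbour in
    desc2, keeping it only if it passes Lowe's ratio test (squared
    distances are compared, hence ratio**2)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    matches = []
    for i, d in enumerate(desc1):
        order = sorted(range(len(desc2)), key=lambda j: dist2(d, desc2[j]))
        best, second = order[0], order[1]
        if dist2(d, desc2[best]) < (ratio ** 2) * dist2(d, desc2[second]):
            matches.append((i, best))
    return matches
```

Ambiguous features, whose two closest candidates are nearly equidistant, are discarded rather than risked as wrong correspondences.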
This approach is implemented in OpenSfM as an incremental reconstruction pipeline: an initial reconstruction is computed from only two views, and this initial point cloud is then enlarged by adding further views until all of them have been included. This process yields a point cloud like the one shown in Fig. 6a.
3 OpenDroneMap (ODM) Stage: Post-processing
With the sparse and dense reconstructions generated by OpenSfM in PLY format, we use OpenDroneMap to post-process the point cloud. The post-processing generates a geo-referenced point cloud in LAS format, a geo-referenced 3D textured mesh (Fig. 8a), and an orthophoto mosaic in GeoTIFF format (Fig. 8b).
The LAS format is a standard binary format for the storage of LIDAR data and the most common format for exchanging point clouds. At the end of the OpenSfM process, the sparse and dense point clouds are generated in PLY format with geo-referenced coordinates. OpenDroneMap converts these files into a geo-referenced point cloud in LAS format, which can be used in other GIS software for visualization or ground analysis.
The 3D textured mesh is a surface representation of the terrain that consists of vertices, edges, faces, and the texture from the input images projected onto it. ODM creates a triangulated mesh using the Poisson algorithm, which uses all the points of the dense point cloud and their normal vectors from the PLY file to interpolate a surface model, generating a welded manifold mesh in the form of a PLY file. Finally, the texture from the input images is projected onto the mesh, producing a 3D textured mesh in OBJ format.
4 Ground Analysis
The study area has little vegetation, with some relatively flat regions and others with a significant slope. The photographs acquired from this area were processed as explained in the previous sections, and different 3D models were obtained.
From the LAS file information, we generated the terrain digital elevation model (DEM) shown in Fig. 9. In this model we can see the different height levels, from the lowest (blue) to the highest (red), with nine elevation levels in total, each in a different color. The figure also shows that the lower zone is relatively flat, because most of that area is blue. Where there is a steep slope, different height levels appear in different colors, showing that the area is not flat. In the upper zone, the orange and red regions, there is a flat area in orange corresponding to a narrow dirt road (visible in the orthophoto of Fig. 8b and the textured mesh of Fig. 8a). In red, we detect trees, the highest elements in the reconstructed area of interest.
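A minimal sketch of how such a DEM raster and its colour classes can be derived from an XYZ point cloud, assuming simple per-cell averaging (GIS packages use more sophisticated interpolation):

```python
from collections import defaultdict

def grid_dem(points, cell=1.0):
    """Bin XYZ points into a square grid of size `cell`, keeping the
    mean elevation per cell - a minimal DEM rasterization."""
    acc = defaultdict(list)
    for x, y, z in points:
        acc[(int(x // cell), int(y // cell))].append(z)
    return {k: sum(v) / len(v) for k, v in acc.items()}

def elevation_class(z, zmin, zmax, levels=9):
    """Assign an elevation to one of `levels` equal-height classes,
    like the nine colour bands of Fig. 9 (class 0 = lowest/blue,
    class levels-1 = highest/red)."""
    if z >= zmax:
        return levels - 1
    return int((z - zmin) / (zmax - zmin) * levels)
```

Each grid cell then receives the colour of its elevation class, which is exactly how flat blue zones and steep multi-coloured zones become visually distinguishable.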
Using the LAS file and the orthophoto obtained with ODM, we generated terrain contour lines (Fig. 10) by placing the DEM on top of the orthophoto. In this figure we can see many contour lines close to each other in the part of the terrain with the steepest slope, mainly because this area is not flat and the height changes faster there. Conversely, since the road is nearly flat, the contour lines on it are farther apart.
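The relation between slope and contour spacing is direct: for a fixed contour interval, the horizontal distance between successive lines is the interval divided by the slope, so steep terrain packs its contour lines together. A small sketch:

```python
def slope(z1, z2, run):
    """Terrain slope (rise over run) between two DEM samples a
    horizontal distance `run` apart."""
    return abs(z2 - z1) / run

def contour_spacing(slope_val, interval=1.0):
    """Horizontal distance between successive contour lines for a
    given slope and contour interval; flat terrain has no crossings."""
    return float('inf') if slope_val == 0 else interval / slope_val
```

With a 1 m contour interval, a 50% slope yields lines only 2 m apart, while a gentle 5% slope spreads them 20 m apart, matching the pattern seen in Fig. 10.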
In this work, we have shown a methodology for 3D terrain reconstruction based entirely on open source software. The georeferenced point clouds, the digital elevation models and the orthophotographs resulting from the proposed processing pipeline can be used in different geomatics and terrain analysis software to generate contour lines and, for instance, to perform surface runoff analysis. Therefore, the combination of open source software with unmanned aerial vehicles is a powerful and inexpensive tool for geomatic applications.
In the bundle adjustment process discussed in Sect. 2.3 and given by Eq. (3), using only matched image points it is not possible to reconstruct the scene at a real-world scale. This restriction is why additional information must be supplied as initialization for the optimization process to recover the scale. This information can be an approximate camera position or the world position of specific points known as Ground Control Points (GCPs). In our reconstruction process we did not use GCPs; only the GPS position of the camera measured by the drone was used to initialize the camera poses, and this measurement is not highly accurate. As future work, we want to use GCPs in addition to the camera GPS positions, comparing both reconstructions to assess the elevation error.
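A simple way to see how external coordinates fix the scale: comparing the spread of corresponding points in the model frame and in the world frame yields the missing scale factor. This is an illustrative estimator under the assumption of noise-free correspondences, not the initialization actually performed inside the bundle adjustment:

```python
import math

def scale_factor(model_pts, world_pts):
    """Estimate the metric scale that maps the arbitrary-unit SfM
    model onto world coordinates (GCPs or camera GPS positions):
    the ratio of RMS spreads about the respective centroids."""
    def spread(pts):
        n = len(pts)
        c = [sum(p[i] for p in pts) / n for i in range(3)]
        return math.sqrt(sum(sum((p[i] - c[i]) ** 2 for i in range(3))
                             for p in pts) / n)
    return spread(world_pts) / spread(model_pts)
```

Because the estimator uses only distances to the centroid, it is insensitive to the (also unknown) rotation and translation between the two frames, which a full similarity alignment would recover as well.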
This work has been partly funded by Universidad Tecnológica de Bolívar project (FI2006T2001). E. Sierra thanks Universidad Tecnológica de Bolívar for a Master's degree scholarship.
- 1. Nelson, A., Reuter, H., Gessler, P.: DEM production methods and sources. Dev. Soil Sci. 33, 65–85 (2009)
- 4. James, M., Robson, S.: Straightforward reconstruction of 3D surfaces and topography with a camera: accuracy and geoscience application. J. Geophys. Res. Earth Surf. 117(F3) (2012)
- 6. Goesele, M., Curless, B., Seitz, S.M.: Multi-view stereo revisited. In: 2006 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2006), pp. 2402–2409. IEEE (2006)
- 7. Agisoft PhotoScan Professional. http://www.agisoft.com/downloads/installer/
- 8. Pix4D. https://pix4d.com/
- 9. Mapillary: OpenSfM. https://github.com/mapillary/OpenSfM
- 10. OpenDroneMap. https://github.com/OpenDroneMap/OpenDroneMap
- 11. Bradski, G., Kaehler, A.: OpenCV. Dr. Dobb's J. Softw. Tools 3 (2000)
- 12. Altizure. https://www.altizure.com
- 13. Cignoni, P., Callieri, M., Corsini, M., Dellepiane, M., Ganovelli, F., Ranzuglia, G.: MeshLab: an open-source mesh processing tool. In: Eurographics Italian Chapter Conference, vol. 2008, pp. 129–136 (2008)
- 14. Duane, C.B.: Close-range camera calibration. Photogram. Eng. 37(8), 855–866 (1971)
- 15. Google Maps. https://maps.google.com
- 17. Bolick, L., Harguess, J.: A study of the effects of degraded imagery on tactical 3D model generation using structure-from-motion. In: Airborne Intelligence, Surveillance, Reconnaissance (ISR) Systems and Applications XIII, vol. 9828, p. 98280F. International Society for Optics and Photonics (2016)
- 18. Grauman, K., Leibe, B.: Visual object recognition. Synthesis Lectures on Artificial Intelligence and Machine Learning 5(2), 1–181 (2011)
- 20. Muja, M., Lowe, D.G.: Fast approximate nearest neighbors with automatic algorithm configuration. In: VISAPP (1), pp. 331–340 (2009)
- 23. Adorjan, M.: OpenSfM: a collaborative structure-from-motion system. Supervisors: M. Wimmer, M. Birsak. Institute of Computer Graphics and Algorithms. Final examination: 02.05.2016 (2016)