# Underwater Active Oneshot Scan with Static Wave Pattern and Bundle Adjustment


## Abstract

Structured light systems (SLS) are widely used for various purposes. Recently, a strong demand to apply SLS to underwater applications has emerged. When an SLS is used in an air medium, the stereo correspondence problem can be solved efficiently by epipolar geometry, owing to the co-planarity of a 3D point and its corresponding 2D points on the camera/projector planes. In underwater environments, however, the camera and projector are usually set in special housings, and refraction occurs at the water/glass and glass/air interfaces, invalidating the conditions for epipolar geometry and strongly affecting the correspondence search. In this paper, we tackle the problem of underwater 3D shape acquisition with an SLS. We propose a method that performs 3D reconstruction by calibrating the system at multiple depths as if it were in air. Since refraction cannot be completely described by a polynomial approximation of a distortion model, a grid-based SLS method is used to absorb the remaining error. Finally, we propose a bundle adjustment method to refine the final result. We tested our method with an underwater SLS prototype, consisting of a custom-made diffractive optical element (DOE) laser and underwater housings, showing the validity of the proposed approach.

## Keywords

Underwater scan, Camera-projector system, One-shot scan

## 1 Introduction

The most critical factor affecting the accuracy of an SLS is calibration. Before SLS reconstruction of an object by using a fixed pattern, the intrinsic parameters of the camera and projector as well as the extrinsic parameters relating them should be estimated. After calibration, reconstruction is done from epipolar geometry: the epipolar lines corresponding to the feature points detected on the camera images can be drawn on the fixed pattern image which is projected by the projector. Finally the correspondences can be found by searching along these epipolar lines, and 3D reconstruction performed by triangulating the corresponding points [12].

When the SLS operates in an air medium, the correspondence problem can be solved efficiently by valid epipolar geometry, owing to the co-planarity of a 3D point and its corresponding 2D points on the camera/projector planes, as shown in Fig. 1(a). However, in an underwater environment, the camera and projector (regarded as an inverse camera) are usually set in special housings [17]. Since refraction occurs at the water/glass and glass/air interfaces, the co-planarity condition no longer holds, as shown in Fig. 1(b). Figure 2(a) and (b) show the line of sight of the camera in air and underwater, respectively. Thus, it is necessary to find an efficient way to calibrate an underwater SLS and to ensure that the epipolar assumptions still hold.
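In air, this co-planarity means a correspondence must satisfy the epipolar constraint \(x'^\top F x = 0\). A minimal sketch of such a residual check, using a hypothetical fundamental matrix for a rectified camera-projector pair (all values here are illustrative, not taken from the paper):

```python
import numpy as np

def epipolar_residual(F, x_cam, x_proj):
    """Residual of the epipolar constraint x_proj^T F x_cam = 0 for
    2D points in pixel coordinates (valid in air, only approximate underwater)."""
    x_cam = np.append(x_cam, 1.0)    # to homogeneous coordinates
    x_proj = np.append(x_proj, 1.0)
    return float(x_proj @ F @ x_cam)

# Hypothetical fundamental matrix of a rectified camera/projector pair:
# corresponding points then share the same vertical coordinate.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])

# A matching pair on the same scanline gives a (near-)zero residual.
r = epipolar_residual(F, np.array([100.0, 50.0]), np.array([80.0, 50.0]))
```

Underwater, the refracted rays make this residual systematically non-zero, which is exactly why the correspondence search must tolerate deviations from the epipolar line.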

This paper proposes three approaches that jointly tackle the aforementioned issues. First, we introduce a depth-dependent calibration method that uses a polynomial approximation model for the SLS in underwater environments. Second, to address the fact that epipolar geometry is only approximately valid underwater, we introduce a grid-based active scanning method (specifically, a wave grid pattern) that allows finding correspondences which stray from the epipolar line while still maintaining good matching performance. Since these results are based on an approximate model, we also introduce a refinement algorithm based on bundle adjustment, which uses the wave reconstruction results as initial parameters to achieve high accuracy. We demonstrate the effectiveness of the proposed approach both in simulation and with a real system, in which the camera and pattern projector are placed underwater in a pool tank inside a special housing.

## 2 Related Work

Calibration models for underwater cameras have been proposed extensively [1, 3, 6, 7, 8, 10, 13, 14, 17]. However, none of them provides a complete calibration and reconstruction procedure for an SLS. Because of the correspondence matching problem in SLS, some of the proposed models become impractical, since they do not offer a practical strategy for matching and reconstruction. Besides, underwater projector calibration is a slightly different issue from camera calibration due to the “blindness” of the projector [4].

There are some early works on underwater 3D reconstruction based on approximation models [3, 13, 14]. Queiroz-Neto *et al.* proposed an underwater model that simply ignores the effects of refraction, but it yields results with low accuracy because of the non-linear refraction effect [14]. Approximate methods have also been proposed, such as focal length adjustment [14], lens radial distortion approximation [3], and a combination of the two [13]. Unfortunately, the accuracy of these approximation models is also insufficient for an SLS that searches correspondences using epipolar geometry.

To improve the accuracy of underwater measurement, several physical models for camera calibration and reconstruction have been proposed [1, 6, 7, 8, 10, 17]. Agrawal *et al.* give a general calibration method for underwater cameras based on a physical refractive model [1]. They assume that all refractive planes are parallel to each other, and derive front-projection and back-projection equations for their model. However, forward projection requires solving 4th-degree equations even in the one-refractive-plane case, and 12th-degree equations in the two-plane case; thus, it is difficult to use this method directly for an SLS. Sedlazeck *et al.* focus on the underwater light rays, which are projected as a curved surface: after learning this surface, perspective projection can be done [17]. With this method, it is also difficult to tackle the forward projection problem because of the complicated learning phase. Kang *et al.* and Jordt-Sedlazeck *et al.* also consider underwater reconstruction with Structure from Motion (SfM) [6, 7]. SfM is a passive way to recover the 3D shape of objects, and it is difficult to achieve a dense reconstruction result because of the difficulty of correspondence search. Kawahara *et al.* proposed a pixel-wise varifocal camera model, in which the focal length of the projection varies pixel by pixel, for modeling the non-central projection of an underwater camera, together with a calibration method for such cameras [8]. They also proposed an active-stereo system composed of a projector and two cameras, where the projections of the cameras and the projector are based on their model [9]. Since image-based correspondence search along epipolar lines is not valid for underwater cameras, they applied a space carving method, which only requires photo-consistency.

In terms of SLS for underwater use, Massot-Campos *et al.* proposed an underwater active stereo system that uses a DOE-based pattern projector [11]. They used a pattern of parallel lines in which individual lines are not coded with local features. Their decoding method (*i.e.*, the method for solving correspondences between the captured image and the projected pattern) relies on the order of the detected lines in the camera image; thus, ambiguity may occur if only a small region of the pattern is detected.

## 3 Overview

### 3.1 System Configuration

1. With two cameras and an appropriate baseline, it is possible to reconstruct areas occluded in one view, thereby covering a much wider area than conventional monocular active sensing.

2. By using multiple cameras, our system is equivalent to multi-view stereo, so its accuracy can be further improved with bundle adjustment.

### 3.2 Algorithm

We adopt a coarse-to-fine approach for reconstruction. First, the approximated model is used to perform the wave grid reconstruction and retrieve a coarse shape. Then, the estimated 3D points are used as initial values for bundle adjustment refinement with an accurate non-central projection camera model that takes the refractive environment into account. We need the approximation model at the coarse level because a central projection model, and hence the epipolar constraint, does not hold in the underwater environment, yet the epipolar constraint is the key to efficiently finding correspondences with active stereo techniques. Approximation errors inevitably occur at the coarse level, but they are corrected during the refinement process. Moreover, there is no practical problem as long as the deviation of the approximate model from the actual model stays within the tolerance of the epipolar matching, so that the correct match is still produced.

### 3.3 Polynomial Approximation of Refraction

**Problem Statement.** Before introducing our polynomial approximation model for refraction, let us consider the problem of performing underwater reconstruction with a full physical refraction model. To simplify the model, we only consider the forward projection with one refractive layer, as introduced in [1]. We suppose that the camera and the projector are each set into a housing, and assume that the housings' thicknesses can be ignored. Figure 5(a) shows the camera model. The coordinate axis *x* lies on the refractive plane, and the refractive indices of the media above and below this plane are \(\mu _1\) and \(\mu _2\), respectively. The blue line shows a ray coming from a 3D point *b*; refraction occurs at its intersection with the plane at point \(p_1 = (x_1,0)\). *d* is the distance between *b* and the camera plane, \(x_b\) is the distance between *b* and the optical axis *z*, and \(y_c\) is the focal length of the camera. The angle of incidence is denoted by \(\alpha \) and the angle of refraction by \(\beta \). Snell's law gives \(\mu _1 \sin \alpha = \mu _2 \sin \beta \); combining it with the geometry above yields the forward-projection equations, which reduce to a fourth-order polynomial in \(x_1\) (Eq. (3)).
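As an illustration of why the physical model is cumbersome, the refraction point can only be found by root-finding. The sketch below uses a simplified 2D parametrization of our own (camera center at height `h` above the interface, underwater point at lateral offset `xb` and depth `zb`; these names are not from the paper) and bisects the Snell residual rather than solving the fourth-order polynomial in closed form:

```python
import math

def refraction_point(h, xb, zb, mu1=1.0, mu2=1.33, tol=1e-12):
    """Find the interface crossing x1 of the ray from underwater point
    (xb, -zb) to a camera center at (0, h), by bisecting the Snell
    residual mu1*sin(alpha) - mu2*sin(beta) on [0, xb]."""
    def snell(x1):
        sin_a = x1 / math.hypot(x1, h)               # air-side angle
        sin_b = (xb - x1) / math.hypot(xb - x1, zb)  # water-side angle
        return mu1 * sin_a - mu2 * sin_b             # increasing in x1
    lo, hi = 0.0, xb
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if snell(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Because \(\mu_2 > \mu_1\), the refraction point lands farther from the optical axis than the straight-line (in-air) crossing, which is precisely the displacement the approximation model below tries to capture.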

**Polynomial Approximation Model.** We propose a polynomial approximation of the full physical refraction model. As shown in Fig. 5(b), we consider two kinds of light paths. The blue arrows show light paths that leave a 3D point on the surface of an object, pass through the water-air interface, and finally enter the camera; the red ones show the same light paths as if they traveled through an air medium. Both the blue and red light paths leave *b* and arrive at the camera center *a*. \(p_1\) is the intersection of the blue ray with the camera plane, and \(p_2\) is the intersection of the red ray. The most important quantity in the polynomial approximation model is the distance between \(p_1\) and \(p_2\), which we define as the approximation error. The relationship between this error and \(p_1\) is given by Eq. (4), a polynomial in *r* with coefficients \(\alpha _1\) and \(\alpha _2\), where *r* denotes the 2D Euclidean distance between the camera center and \(p_1\) in the xyz coordinate frame. During the calibration phase, not only the extrinsic parameters but also the polynomial approximation parameters \(\alpha _1\) and \(\alpha _2\) are estimated. The pinhole projection can then be represented by Eq. (4) together with the approximation model.
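The exact form of Eq. (4) is not reproduced here; as a hedged illustration, assume an odd polynomial \(\mathrm{err}(r) \approx \alpha_1 r + \alpha_2 r^3\), analogous to a radial distortion model. Its coefficients can then be estimated by linear least squares:

```python
import numpy as np

def fit_refraction_poly(r, err):
    """Least-squares fit of err(r) ~ a1*r + a2*r^3 -- a hypothetical
    stand-in for Eq. (4); the paper's exact polynomial may differ."""
    A = np.column_stack([r, r**3])
    (a1, a2), *_ = np.linalg.lstsq(A, err, rcond=None)
    return a1, a2

# Synthetic samples from a known polynomial recover its coefficients.
r = np.linspace(0.0, 1.0, 50)
err = 0.02 * r + 0.005 * r**3
a1, a2 = fit_refraction_poly(r, err)
```

In practice, `err` would come from comparing the refracted intersection \(p_1\) with the in-air intersection \(p_2\) at many sample points during calibration.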

## 4 Depth Dependent Calibration

### 4.1 Overview of the Calibration Process

First, the camera and projector are put into their respective housings, and placed into a pool filled with water. After that, the intrinsic parameters of the camera are estimated with a checkerboard [19]. Then, the intrinsic parameters of the projector and the extrinsic parameters between them are estimated by a second calibration using a sphere of known size, described in the next section.

Since the effect of refraction is depth dependent, we conduct the calibration at multiple depths. From the multiple calibration results, it would be possible to represent the refraction effect with several hyper-parameters; however, for simplicity we take another approach to cope with the depth-dependent effect in this paper and leave hyper-parameter estimation for future work. To retrieve a discrete set of depth-dependent calibration parameters, we place the calibration objects, *i.e.*, checkerboard planes and a sphere, at multiple depths and conduct the calibrations independently.

For the selection of the best parameters, the residual errors of the epipolar constraints are used. To this end, the 3D reconstruction process is conducted independently for each parameter set. The sum of the residual epipolar errors over all correspondences is then calculated and used to select the best result.
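The selection step can be sketched as follows; `residual_fn`, the parameter sets, and the correspondence format are placeholders, not the paper's actual interfaces:

```python
def select_calibration(param_sets, correspondences, residual_fn):
    """Pick the depth-dependent parameter set minimizing the summed
    epipolar residuals over all correspondences (cf. Sect. 4.1).
    residual_fn(params, cam_pt, proj_pt) -> signed epipolar residual."""
    best, best_err = None, float("inf")
    for params in param_sets:
        err = sum(abs(residual_fn(params, c, p)) for c, p in correspondences)
        if err < best_err:
            best, best_err = params, err
    return best, best_err
```

Each candidate parameter set thus requires a full reconstruction pass, but the number of depths is small, so the cost stays manageable.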

### 4.2 Sphere Based Projector Calibration

For sphere-based calibration, images are captured with the pattern projected onto the sphere as shown in Fig. 6(a). The radius of the spherical surface is known. From the image, points on the spherical contour are sampled. Also, the correspondences between the grid points on the camera image and the grid points on the projected pattern are assigned manually.

For the calibration process, we minimize the reprojection errors between the imaged grid points on the sphere and the simulated grid positions, with respect to the extrinsic parameters, the intrinsic parameters of the projector, and the position of the calibration sphere. Figure 6(b) shows how the simulated grid positions are calculated. From a grid point of the projector (for example, \(\mathrm{g}_{p1}\) in Fig. 6(b)), the grid projection on the sphere (\(\mathrm{g}_{c1}\)) is calculated by ray-tracing and is projected to the camera (\(\mathrm{g}_{i1}\)). If the ray of a grid point does not intersect the sphere (for example, \(\mathrm{g}_{p2}\)), we use the intersection of the ray with an auxiliary plane (\(\mathrm{g}_{c2}\)) that is fronto-parallel and contains the sphere center.

The sampled points on the spherical contour define viewing rays that should be tangent to the sphere (of known radius *r*). Thus, the difference between the ray's distance from the sphere center \(\mathbf{c}\) and the radius, \(\sqrt{\Vert \mathbf{c}\Vert ^2-(\tilde{\mathbf{s}}\cdot \mathbf{c})^2}-r\), where \(\tilde{\mathbf{s}}\) is the unit direction of the ray, is also treated as an error. The sum of squares of all these errors is minimized using the Levenberg-Marquardt method.
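This tangency error can be sketched as follows, assuming a ray through the camera center at the origin with unit direction \(\tilde{\mathbf{s}}\) (a minimal NumPy version under that assumption, not the authors' code):

```python
import numpy as np

def tangency_error(s, c, r):
    """Error of a contour ray with direction s: the perpendicular distance
    from the sphere center c to the ray through the origin should equal
    the radius r, i.e. sqrt(|c|^2 - (s.c)^2) - r."""
    s = s / np.linalg.norm(s)  # ensure unit direction
    return np.sqrt(np.dot(c, c) - np.dot(s, c) ** 2) - r
```

A ray grazing the sphere yields a zero error, while a ray through the sphere center yields an error of \(-r\), so the squared sum penalizes both kinds of deviation from the contour.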

## 5 3D Reconstruction

### 5.1 Wave Grid Reconstruction

### 5.2 Refinement with Bundle Adjustment

Refinement of the 3D shape as well as the camera and projector parameters is conducted in the following way. We set the 3D points and the pose of the glass interface between air and water as the parameters to be estimated by bundle adjustment. The glass interface is described with four parameters: a surface normal and the distance between the camera center and the glass surface. Since hundreds of corresponding points between the camera and projector images are retrieved by the wave reconstruction process, we can calculate the reprojection error by solving the fourth-order polynomial of Eq. (3). The Levenberg-Marquardt algorithm is used to minimize the error.

The main differences between ordinary bundle adjustment, as used in structure from motion or multi-view stereo, and ours are twofold. First, we use the fourth-order polynomial equation to calculate the 2D coordinates on the image plane back-projected from the 3D points, taking into account the refraction between water and air. Second, we include the rigid transformation parameters of the water-air interface plane among the variables estimated in the bundle adjustment process.
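Assuming the parametrization above (four interface parameters followed by the 3D points), the residual construction for such a bundle adjustment might look as follows. The refractive forward projection `project` is left as a caller-supplied function (in the paper it solves the fourth-order polynomial of Eq. (3)); the pinhole stand-in below is used only for illustration:

```python
import numpy as np

def reprojection_residuals(x, observations, project):
    """Unpack bundle-adjustment parameters -- interface plane (normal n,
    distance d from the camera center) followed by the 3D points -- and
    stack the reprojection residuals against the observed 2D points."""
    n = x[:3] / np.linalg.norm(x[:3])  # interface plane normal
    d = x[3]                           # camera-to-interface distance
    pts = x[4:].reshape(-1, 3)         # 3D points
    res = []
    for p, obs in zip(pts, observations):
        res.extend(project(p, n, d) - obs)
    return np.array(res)

# Illustrative pinhole stand-in for the refractive forward projection.
pinhole = lambda p, n, d: p[:2] / p[2]
pts = np.array([[0.1, 0.2, 1.0], [0.3, -0.1, 2.0]])
obs = [p[:2] / p[2] for p in pts]
x = np.concatenate([np.array([0.0, 0.0, 1.0, 0.5]), pts.ravel()])
res = reprojection_residuals(x, obs, pinhole)  # zero at ground truth
```

A Levenberg-Marquardt solver would then minimize the squared norm of `res` over `x`, jointly refining the interface pose and the 3D points.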

## 6 Experiments

### 6.1 Depth Dependent Calibration

### 6.2 Wave Oneshot Reconstruction

### 6.3 Evaluation of Refinement Algorithm

First, we checked the effectiveness of the optimization method with simulation data. We assume an underwater environment and emit 7 × 10 points from a virtual projector onto a board 2 m ahead. We then synthesize the image with a virtual camera and conduct reconstruction with the approximation model. Using the predefined parameters and the synthesized image, we run the refinement algorithm; the results are shown in Figs. 10 and 11. They confirm that the reprojection error, calculated by solving the fourth-order polynomial equation that accounts for refraction, decreases with our bundle adjustment algorithm, and that correct shapes are reconstructed.

Finally, we optimize the reconstruction results captured by our underwater scanning system. We project the wave pattern onto a planar board, a sphere, and a mannequin, first restore the shapes with the approximated model, and then refine them with our bundle adjustment algorithm. As shown in Fig. 12, the reprojection error decreases drastically, and the two right columns of Fig. 13 (green shapes) show that our refinement algorithm successfully merges the two shapes (left two columns, red and white shapes) into a single consistent shape. For quantitative evaluation, we compute the RMSE of the planar board by fitting a plane to it; the RMSE drops drastically from 9.7 mm to 0.7 mm, confirming the effectiveness of our algorithm.
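The plane-fitting RMSE used for the board evaluation can be computed by total least squares; a minimal sketch (not the authors' code):

```python
import numpy as np

def plane_rmse(points):
    """Fit a plane to an Nx3 point set by SVD (total least squares) and
    return the RMSE of the point-to-plane distances."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]                      # direction of least variance
    dist = (points - centroid) @ normal  # signed point-to-plane distances
    return np.sqrt(np.mean(dist ** 2))
```

For a perfectly planar point cloud the RMSE is zero; residual refraction errors in the coarse reconstruction show up directly as a larger RMSE, which is what the refinement reduces.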

## 7 Conclusion and Future Work

In this paper, we proposed a practical one-shot active 3D scanning method for underwater environments. To realize the system, we proposed three solutions. First, we calibrate the camera and projector parameters with a polynomial approximation at multiple depths. Then, shapes are reconstructed by wave reconstruction, which tolerates the inevitable errors in epipolar geometry. Finally, the 3D shapes are refined by a bundle adjustment algorithm that calculates the actual 2D positions on the image plane by solving the fourth-order polynomial of the physical model. Experiments in simulation and in a real environment show the effectiveness of our method. Adding a temporal constraint to recover moving objects underwater is our future work.

## References

- 1. Agrawal, A., Ramalingam, S., Taguchi, Y., Chari, V.: A theory of multi-layer flat refractive geometry. In: CVPR (2012)
- 2. Aoki, H., Furukawa, R., Aoyama, M., Hiura, S., Asada, N., Sagawa, R., Kawasaki, H., Shiga, T., Suzuki, A.: Noncontact measurement of cardiac beat by using active stereo with waved-grid pattern projection. In: 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 1756–1759. IEEE (2013)
- 3. Ferreira, R., Costeira, J.P., Santos, J.A.: Stereo reconstruction of a submerged scene. In: Marques, J.S., Pérez de la Blanca, N., Pina, P. (eds.) IbPRIA 2005. LNCS, vol. 3522, pp. 102–109. Springer, Heidelberg (2005)
- 4. Fu, X., Wang, Z., Kawasaki, H., Sagawa, R., Furukawa, R.: Calibration of the projector with fixed pattern and large distortion lens in a structured light system. In: The 13th IAPR Conference on Machine Vision Applications (2013)
- 5. Hall-Holt, O., Rusinkiewicz, S.: Stripe boundary codes for real-time structured-light range scanning of moving objects. In: ICCV, vol. 2, pp. 359–366 (2001)
- 6. Jordt-Sedlazeck, A., Jung, D., Koch, R.: Refractive plane sweep for underwater images. In: Weickert, J., Hein, M., Schiele, B. (eds.) GCPR 2013. LNCS, vol. 8142, pp. 333–342. Springer, Heidelberg (2013)
- 7. Kang, L., Wu, L., Yang, Y.-H.: Two-view underwater structure and motion for cameras under flat refractive interfaces. In: Fitzgibbon, A., Lazebnik, S., Perona, P., Sato, Y., Schmid, C. (eds.) ECCV 2012, Part IV. LNCS, vol. 7575, pp. 303–316. Springer, Heidelberg (2012)
- 8. Kawahara, R., Nobuhara, S., Matsuyama, T.: A pixel-wise varifocal camera model for efficient forward projection and linear extrinsic calibration of underwater cameras with flat housings. In: IEEE International Conference on Computer Vision Workshops (ICCVW), pp. 819–824. IEEE (2013)
- 9. Kawahara, R., Nobuhara, S., Matsuyama, T.: Underwater 3D surface capture using multi-view projectors and cameras with flat housings. IPSJ Trans. Comput. Vis. Appl. 6, 43–47 (2014)
- 10. Lavest, J.-M., Rives, G., Lapresté, J.T.: Underwater camera calibration. In: Vernon, D. (ed.) ECCV 2000. LNCS, vol. 1843, pp. 654–668. Springer, Heidelberg (2000)
- 11. Massot-Campos, M., Oliver-Codina, G.: Underwater laser-based structured light system for one-shot 3D reconstruction. In: IEEE SENSORS, pp. 1138–1141. IEEE (2014)
- 12. Mazaheri, M., Momeni, M.: 3D modeling using structured light pattern and photogrammetric epipolar geometry. Int. Arch. Photogrammetry Remote Sens. Spat. Inf. Sci. 37, 87–90 (2008)
- 13. Pizarro, O., Eustice, R., Singh, H.: Relative pose estimation for instrumented, calibrated imaging platforms. In: DICTA, pp. 601–612. Citeseer (2003)
- 14. Queiroz-Neto, J.P., Carceroni, R., Barros, W., Campos, M.: Underwater stereo. In: Proceedings of the 17th Brazilian Symposium on Computer Graphics and Image Processing, pp. 170–177. IEEE (2004)
- 15. Sagawa, R., Ota, Y., Yagi, Y., Furukawa, R., Asada, N., Kawasaki, H.: Dense 3D reconstruction method using a single pattern for fast moving object. In: ICCV (2009)
- 16. Sagawa, R., Sakashita, K., Kasuya, N., Kawasaki, H., Furukawa, R., Yagi, Y.: Grid-based active stereo with single-colored wave pattern for dense one-shot 3D scan. In: 3DIMPVT, pp. 363–370 (2012)
- 17. Sedlazeck, A., Koch, R.: Calibration of housing parameters for underwater stereo-camera rigs. In: BMVC (2011)
- 18. Young, M., Beeson, E., Davis, J., Rusinkiewicz, S., Ramamoorthi, R.: Viewpoint-coded structured light. In: CVPR (2007)
- 19. Zhang, Z.: A flexible new technique for camera calibration. Technical report MSR-TR-98-71 (1998)