Abstract
In this paper, we propose a new self-calibration algorithm for upgrading projective space to Euclidean space. The proposed method combines the most commonly used metric constraints, including zero skew and unit aspect ratio, by formulating each constraint as a cost function within a unified framework. Additional constraints, e.g., constant principal points, can also be formulated in the same framework. The cost function is very flexible and can be composed of different constraints on different views. The upgrade process is then stated as a minimization problem, which may be solved by minimizing an upper bound of the cost function. The proposed method is non-iterative. Experimental results on synthetic and real data are presented to show the performance of the proposed method and the accuracy of the reconstructed scene.
1 Introduction
A projective space can be reconstructed robustly from 2D correspondences across multiple uncalibrated images by one of the several proposed projective reconstruction methods [11–13, 28, 29]. Projective space, however, does not contain sufficient information for human perception of the 3D scene. Upgrading from projective to Euclidean space is necessary for visualization or virtual navigation.
The upgrade process is referred to as ‘Euclidean reconstruction’ or self-calibration. The proposed algorithm in this paper is a kind of self-calibration where the camera intrinsic and extrinsic parameters are computed from 2D correspondences without any knowledge about the scene and cameras.
In this paper, we propose a new self-calibration algorithm for upgrading the reconstructed projective space to Euclidean space.
Three usual conditions that can be used for Euclidean reconstruction are:
1. the zero-skew constraint;
2. the unit aspect-ratio constraint; and
3. the (partial) constant principal-point constraint.
These three constraints are formulated in the same framework such that the algorithm treats every view and constraint equally. Our proposed algorithm is flexible since the above three constraints can be customized for any specific situation.
The whole process is non-iterative with the runtime of this algorithm being proportional to the number of applied constraints.
The paper is organized as follows. We first provide a literature review on self-calibration in Sect. 2. The relationship between the dual image of the absolute conic and the absolute dual quadric for camera calibration is briefly described in Sect. 3. The problem of Euclidean reconstruction is formulated in Sect. 4. The estimation of H 2 is described in Sect. 5 and the theory of the proposed algorithm is derived in Sect. 6.
The complete algorithm is provided in Sect. 7. Experimental results are given in Sect. 8. The proof of rank-4 properties is given in Sect. 9. Some additional constraints are provided in Sect. 10. Section 11 contains some concluding remarks.
2 Literature Review
In classical calibration methods, the camera intrinsic (i.e. K i ) and extrinsic (i.e. rotation, R i and translation, t i ) parameters are computed from images of a calibration board with known grid patterns. The intrinsic and extrinsic parameters can be estimated accurately. Self-calibration, a new trend of camera calibration, is to obtain the camera intrinsic and extrinsic parameters from point correspondences of unknown objects instead of known objects. Maybank and Faugeras [14] proved that self-calibration is possible when the intrinsic parameters are fixed over a sequence of images by solving the Kruppa equations [4, 14, 32], which are a set of non-linear constraints on the intrinsic parameters. However, the result is very sensitive to noise. Sturm [27] pointed out that the Kruppa equations will fail for some non-critical motions (which are non-degenerate configurations for other self-calibration methods). Gurdjos et al. [7], Sturm [26] also studied ‘artificial critical motion sequences’ for linear self-calibration algorithms [1, 7, 8, 18, 31].
Hartley [9] proposed a series of non-linear algorithms to reconstruct the Euclidean model and camera parameters from 2D correspondences by assuming constant intrinsic parameters. Pollefeys and Gool [17] proposed the modulus constraint to recover the affine space from projective space and solve for the dual images of the absolute conic to upgrade an affine space to a Euclidean space. This method relies on the assumption of constant intrinsic parameters. There are at most 64 solutions for the locations of the plane at infinity but not all of them are sensible solutions. It is costly to solve for the 64 possible solutions by the usual continuation method. A new derivation of the modulus constraint directly from three views is proposed by Schaffalitzky [22]. The number of feasible solutions is then reduced to 21 and a numerical algorithm is also provided for computing them efficiently. However, there is still the problem of deciding which one is the correct solution.
Fundamental matrices between any two views can be computed easily from a projective reconstruction. A simple non-linear method [15] decomposes the fundamental matrices into essential matrices by iteratively enforcing the property of the singular values of an essential matrix (two equal non-zero singular values and one zero). The problem of Euclidean reconstruction is formulated as a minimization problem directly parameterized in terms of all intrinsic parameters (and there is no constraint on the intrinsic parameters). More detailed experimental results can be found in [5]. There is, however, no proof of convergence, and the result relies on a good initial guess.
Instead of direct parameterizations on intrinsic parameters, there are some implicit methods [1, 2, 8, 12, 16, 18, 20, 23–25, 31] for recovering the projective distortion matrix to upgrade projective to Euclidean space. Triggs [31] proposed to estimate the absolute quadric of the projective space based on the assumption of constant intrinsic parameters. This non-linear method requires at least four views.
If the skew factors are zero and both the principal points and the aspect ratios are known, the projective distortion matrix H∈ℝ4×4 can be solved linearly [2, 8, 16, 18, 19].
The method proposed by Han and Kanade [8] needs at least 8 views to solve for the absolute quadric [31] linearly. Sainz et al. [21] further developed the method of [8] to enforce the rank-3 property of the absolute quadric under the assumptions that all the cameras have zero skew, unit aspect ratio and known principal points. The rank-3 property is obtained by solving a 4th-order polynomial, the determinant of a one-parameter linear combination of two possible solutions. In practice, in the presence of noise there will not be exactly two possible solutions. Moreover, the important positive semi-definite property of the absolute dual quadric is not enforced in their algorithm. Pollefeys et al. [19] proposed to formulate the linear approach of Han and Kanade [8] differently from [21], under the assumptions that all cameras have zero skew, unit aspect ratio and known principal points, for at least 3 views. A direct parameterization is proposed to solve for the simplified absolute quadric with 5 unknowns, based on special choices of the first camera projection matrix in the projective and the upgraded Euclidean spaces. When there are only 2 views, the solution is determined up to a one-parameter family. Similarly to [21], the rank-3 constraint on the simplified absolute quadric can be imposed when solving for this one parameter, but there will be 4 possible solutions.
Heyden and Åström [12] introduced the notion of a camera with a Euclidean image plane, i.e., a camera satisfying two conditions: zero skew and unit aspect ratio. They also proved that it is possible to upgrade projective to Euclidean space when the cameras have Euclidean image planes. Seo and Heyden [23] proposed another iterative linear algorithm to solve for the absolute dual quadric. This method needs many iterations to converge and its numerical stability is not considered. Seo and Hong [25] proposed a linear approach to estimate the absolute dual quadric by complex eigen-decomposition. As the method is developed for the zero-skew constraint only, it cannot make use of other available constraints (such as known aspect ratio and principal point). Seo and Heyden [24] further proposed to alternately estimate the absolute dual quadric by applying a linear method [18] and re-estimating the principal points. The applied linear method, proposed by Pollefeys et al. [18], solves for the absolute dual quadric linearly by assuming zero skew and known aspect ratio and principal point.
Under the same assumptions on an image sequence with Euclidean image planes, Bougnoux [1] proposed a closed-form solution for calculating the focal lengths and the plane at infinity, upgrading to a Euclidean reconstruction with a ‘visually perfect’ result. It is proved in [1] that this method has an anisotropic-homothety ambiguity, so the upgraded 3D scene is only good under visual verification and the estimated intrinsic parameters are not accurate. Iterative methods suffer from the usual problem that the algorithm must be initialized with a sufficiently accurate guess.
In this paper, we propose a new self-calibration algorithm to estimate the projective distortion matrix. The method of recovering the projective distortion matrix is formulated in a subspace framework. The proposed method is also based on solving the absolute dual quadric [10]. We unify most of the common constraints on intrinsic parameters (such as zero-skew constraint, unit aspect-ratio constraint, constant principal points etc.) within the same subspace framework. The proposed algorithm is simple and flexible for combining different assumptions in a single minimization problem. The derivations of different constraints for different assumptions will be provided. Some features of the proposed algorithm are:
1. a non-iterative algorithm;
2. views are treated equally and all constraints are treated equally;
3. options for combining different constraints.
3 Background
3.1 Dual image of the absolute conic and absolute dual quadric
In the Euclidean space, the absolute dual quadric is defined as \(\varOmega^{\ast}=\operatorname{diag}(1,1,1,0)\).
Let us denote a rigid transformation T as \({T}= \left[ \begin{smallmatrix} {R} & \mathbf{t}\\ \mathbf{0}^{T} & 1 \end{smallmatrix} \right]\),
where R is a 3×3 rotation matrix and t is a 3×1 translation vector. The absolute dual quadric transformed by T can be expressed as \({T}\varOmega^{\ast}{T}^{T}=\varOmega^{\ast}\),
which shows that the absolute dual quadric is invariant to any rigid transformation. The absolute dual quadric can be projected to any camera and its image on the image plane is called the dual image of the absolute conic (DIAC), \({\omega}_{i}^{\ast }\) [10]. If the projection matrix of the ith view is P i =K i [R i |t i ], its dual image of the absolute conic is given by \({\omega}_{i}^{\ast}={P}_{i}\varOmega^{\ast}{P}_{i}^{T}={K}_{i}{K}_{i}^{T}\).
Hence, the dual image of the absolute conic \({\omega}_{i}^{\ast }\) is only related to the intrinsic parameters of the ith view.
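These two facts are easy to verify numerically. A minimal NumPy sketch (the intrinsic values and pose below are arbitrary assumptions, not values from the paper) checks both the rigid-transformation invariance of \(\varOmega^{\ast}\) and the projection identity:

```python
import numpy as np

# Illustrative (assumed) intrinsic parameters and pose.
K = np.array([[1000.0, 0.5, 500.0],
              [0.0, 1000.0, 400.0],
              [0.0, 0.0, 1.0]])
rng = np.random.default_rng(0)
Q0, _ = np.linalg.qr(rng.standard_normal((3, 3)))
R = Q0 * np.sign(np.linalg.det(Q0))          # proper rotation, det = +1
t = rng.standard_normal((3, 1))

P = K @ np.hstack([R, t])                    # P = K [R | t]
Omega = np.diag([1.0, 1.0, 1.0, 0.0])        # absolute dual quadric

# Invariance under a rigid transformation T.
T = np.eye(4)
T[:3, :3], T[:3, 3] = R, t.ravel()
assert np.allclose(T @ Omega @ T.T, Omega)

# The DIAC depends only on the intrinsics: P Omega P^T = K K^T.
assert np.allclose(P @ Omega @ P.T, K @ K.T)
```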
4 Problem Formulation
A projective frame can be reconstructed from 2D correspondences across multiple views by projective reconstruction methods [11–13, 28, 29]. We choose the method of Hung and Tang [13] to minimize 2D reprojection error. To upgrade the reconstructed projective frame (\(\hat{{P}}_{i}\) and \(\hat{\mathbf{X}}_{j} \)) to a Euclidean frame (\(\tilde{{K}}_{i}, \tilde{{R}}_{i}, \tilde{\mathbf{t}}_{i}\) and \(\tilde{\mathbf{X}}_{j}\)), metric constraints are applied to the reconstructed projective projection matrices \(\hat{{P}}_{i}\) to recover the projective distortion matrix, H∈ℝ4×4 so that all the upgraded projection matrices \(\tilde{{P}}_{i}=\hat{{P}}_{i}{H}\) can be decomposed as \(\tilde{{P}}_{i}=\tilde{{K}}_{i} [ \tilde{{R}}_{i}^{T}|{-}\tilde{{R}}_{i}^{T} \tilde{\mathbf{t}}_{i} ]\) and the Euclidean shape is then given by \(\tilde{\mathbf{X}}_{j}\sim {H}^{-1}\hat{\mathbf{X}}_{j}\). A camera matrix K i can be parameterized as
\({K}_{i}= \left[ \begin{smallmatrix} f_{i} & s_{i} & u_{i}\\ 0 & \alpha_{i} f_{i} & v_{i}\\ 0 & 0 & 1 \end{smallmatrix} \right]\)
where s i is the skew ratio, \([ u_{i}\;v_{i}\;1 ]^{T}\) is the principal point, f i is the scaling factor (focal length to pixel size ratio) and α i is the aspect ratio for the ith view. Substituting K i from (2) into (1) gives the dual image of the absolute conic \({\omega}_{i}^{\ast}={K}_{i}{K}_{i}^{T}\) in terms of these intrinsic parameters.
To relate the projective distortion matrix H to the dual image of the absolute conic \({\omega}_{i}^{\ast }\), first denote H as
where H 1∈ℝ4×3 is the first 3 columns of H and H 2∈ℝ4×1 is the last column of H. The upgraded projection matrix can be expressed as
where M i is defined as
and m 1i ,m 2i and m 3i are 3-vectors. The absolute dual quadric Ω ∗ can be projected onto the image planes as the dual images of the absolute conic. Similarly to (1), the projection of the absolute dual quadric by the upgraded projection matrix \(\tilde{{P}}_{i}\) in (5) can also be expressed as
Both projections can be combined as
where \({Q}={H}_{1}{H}_{1}^{T}\) ∈ℝ4×4 is the absolute dual quadric, and \({K}_{i}{K}_{i}^{T}\), the dual image of the absolute conic, is its image on the ith view.
In Sect. 5, we will show how to determine H 2 linearly by choosing the world origin at the centroid of the upgraded 3D points. In Sect. 6, a flexible approach to solve H 1 from user selected constraints (such as zero-skew constraint, unit aspect-ratio constraint and/or partial constant principal-point constraints) is proposed.
5 Estimating H 2
To estimate H 2, we choose the centroid of the scaled upgraded 3D points \(\upsilon_{j} \tilde{\mathbf{X}}_{j} = {H}^{-1}\hat{\mathbf{X}}_{j}\) to be at the origin so that \(\sum_{j=1}^{n} \upsilon_{j}\tilde{\mathbf{X}}_{j}= [ 0\;0\;0\;\varUpsilon ]^{T}\),
where \(\varUpsilon=\sum_{j=1}^{n} \upsilon_{j}\) and \(\tilde{\mathbf{X}}_{j}\ (j=1, \ldots , n)\) are the upgraded 3D points in homogeneous coordinates in a Euclidean frame. The projection equation for \(\tilde{\mathbf{X}}_{j}\) can be expressed as
The scale factor λ ij in (9) can be obtained from w ij , \(\hat{{P}}_{i}\) and \(\hat{{X}}_{j}\) in the reconstructed projective frame. Summing all scaled 2D points for the ith view, we get
Equation (10) can be formulated as a least-squares problem of estimating H 2 from all views, such that
H 2 can then be estimated by solving (11) as a linear least-squares problem, ignoring ϒ since H 2 can be determined only up to scale. By a counting argument, we require 3m≥4; thus, the minimum number of views m for solving H 2 is 2. This choice of the upgraded Euclidean coordinates is adapted from the factorization method for orthographic projection [30].
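The construction above can be sketched as follows (an illustrative NumPy simulation, not the paper's implementation; the random cameras and scene are assumptions). The Euclidean scene centroid is placed at the origin, a random H produces a projective frame, and the stacked least-squares system over all views recovers H 2 up to the overall scale ϒ:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic Euclidean scene whose centroid is at the origin, then
# "downgraded" by a random projective distortion H (all assumed values).
n, m = 20, 4
X_e = rng.standard_normal((3, n))
X_e -= X_e.mean(axis=1, keepdims=True)        # centroid at the origin
X_e_h = np.vstack([X_e, np.ones((1, n))])     # homogeneous 4 x n

H = rng.standard_normal((4, 4))               # projective distortion
X_hat = H @ X_e_h                             # projective 3D points

A, b = [], []
for _ in range(m):
    P_e = rng.standard_normal((3, 4))         # stand-in Euclidean camera
    P_hat = P_e @ np.linalg.inv(H)            # projective camera
    x = P_hat @ X_hat                         # scaled 2D points lambda*w
    A.append(P_hat)
    b.append(x.sum(axis=1))                   # sum_j lambda_ij w_ij

A = np.vstack(A)                              # (3m x 4); need 3m >= 4
b = np.concatenate(b)
h2, *_ = np.linalg.lstsq(A, b, rcond=None)

# h2 recovers the last column of H up to the scale Upsilon = n.
assert np.allclose(h2 / n, H[:, 3])
```

With m=2 the system already has 6 equations for the 4 unknowns, matching the counting argument in the text.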
6 Estimating H 1 (Absolute Dual Quadric)
From (8), the absolute dual quadric Q is equal to \({H}_{1}{H}_{1}^{T}\) and is rank-3 as H 1 is rank-3. Denote Q as
Collect the 10 variables of Q into a vector as
From (6) and (8), each \({K}_{i}{K}_{i}^{T}\) can be written as
By (8), each element of this matrix is linear in the 10 elements (i.e. q k , ∀k) of Q. The absolute dual quadric Q can be obtained by applying different constraints on the dual image of the absolute conics \({M}_{i}{M}_{i}^{T}\). Three different constraints on the dual image of the absolute conic will be considered here.
1. Zero-skew constraint
Recall that \(\tilde{{P}}_{i}=\tilde{{K}}_{i} [\tilde{{R}}_{i}^{T}|-\tilde{{R}}_{i}^{T}\tilde{\mathbf{t}}_{i} ]\) and, using (5), we have \({M}_{i}=\tilde{{K}}_{i}\tilde{{R}}_{i}^{T}\). Writing the orthogonal matrix \(\tilde{{R}}_{i}=[\mathbf{r}_{1i} \; \mathbf{r}_{2i} \; \mathbf{r}_{3i}] \in \mathbb{R}^{3 \times 3}\), the three columns of \({M}_{i}^{T}\) can be expanded as
(15)

Let us define the zero-skew constraint following Faugeras [3] as

$$ \phi_z({M}_i)= ( \mathbf{m}_{1i}\times \mathbf{m}_{3i} ) \cdot ( \mathbf{m}_{2i}\times \mathbf{m}_{3i} ). $$

(16)

Expanding (16) by the expressions from (15), we have

$$ \phi_z({M}_i)=\alpha_i f_i s_i. $$

(17)

Hence, the zero-skew constraint can be written as

$$ \phi_z({M}_i)=0. $$

(18)

Equation (18) can be treated as a 4D ruled quadric in a 10D space as

$$ \mathbf{q}^T{\varPhi}_i^z\mathbf{q}=0 $$

(19)

where \({\varPhi}_{i}^{z}\) is a 10×10 symmetric matrix and it is of rank-4. The derivation of an expression for \({\varPhi}_{i}^{z}\) and the proof of the rank-4 property are given in Sect. 9.
2. Unit aspect-ratio constraint
Define the unit aspect-ratio constraint ϕ u (M i ) following Faugeras [3] as
$$ \phi_u( {M}_i)=\vert \mathbf{m}_{1i}\times \mathbf{m}_{3i} \vert ^2-\vert \mathbf{m}_{2i}\times \mathbf{m}_{3i} \vert ^2. $$

(20)

Expanding (20) by (15), we have

$$ \phi_u({M}_i)=|s_i \mathbf{r}_{1i}-f_i \mathbf{r}_{2i}|^2-\alpha_i^2f_i^2=s_i^2+ \bigl(1-\alpha_i^2\bigr)f_i^2. $$

(21)

For general cameras, s i is almost zero and the magnitude of f i is usually several thousand times that of s i . As f i ≫s i , the skew factor s i is negligible compared with f i . When the zero-skew constraint is enforced, (21) is equal to zero only if \(\alpha_{i}^{2}=1\). The unit aspect-ratio constraint can therefore be imposed as

$$ \phi_u({M}_i)=0. $$

(22)

Equation (22) can also be treated as a 4D ruled quadric in a 10D space as

$$ \mathbf{q}^T{\varPhi}_i^u \mathbf{q}=0 $$

(23)

where the symmetric matrix \({\varPhi}_{i}^{u}\in \mathbb{R}^{10\times 10}\) is of rank-4. An expression for \({\varPhi}_{i}^{u}\) is given in Sect. 9.
3. Constant principal-point constraints
For the ith and jth views, constant principal-point constraints consist of two equations u i =u j and v i =v j . By comparing the entries (1,3) and (2,3) of \({K}_{i}{K}_{i}^{T}\) and \({M}_{i}{M}_{i}^{T}\), the two constraints can be expressed as
Both conditions can be transformed into quadratic equations in q, namely
$$ \mathbf{q}^T{\varPhi}_{ij}^x \mathbf{q}=0 $$

(24)

and

$$ \mathbf{q}^T{\varPhi}_{ij}^y \mathbf{q}=0 $$

(25)

where \({\varPhi}_{ij}^{x}\) and \({\varPhi}_{ij}^{y}\) are 10×10 symmetric matrices and they are of rank-4.
Equations (19), (23), (24) and (25) are quadratic equations in q. The derivations of expressions for \({\varPhi}_{i}^{z}\), \({\varPhi}_{i}^{u}\), \({\varPhi}_{ij}^{x}\) and \({\varPhi}_{ij}^{y}\) and the proof of the rank-4 properties are given in Sect. 9. It can be shown that each of \({\varPhi}_{i}^{z}\), \({\varPhi}_{i}^{u}\), \({\varPhi}_{ij}^{x}\) and \({\varPhi}_{ij}^{y}\) has 4 non-zero eigenvalues of which two are positive and two are negative. This kind of quadric is called a ‘ruled quadric’ [10].
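These properties can be checked numerically. The sketch below (NumPy; the camera values, the helper names `phi_z`/`phi_u`, and the upper-triangular ordering of q are illustrative assumptions) evaluates the two constraints through the DIAC \(W=MM^{T}\) using the Lagrange identity \((\mathbf{a}\times\mathbf{b})\cdot(\mathbf{c}\times\mathbf{d})=(\mathbf{a}\cdot\mathbf{c})(\mathbf{b}\cdot\mathbf{d})-(\mathbf{a}\cdot\mathbf{d})(\mathbf{b}\cdot\mathbf{c})\), recovers a \({\varPhi}^{z}\) for a generic camera by polarizing the quadratic form, and confirms the rank-4, two-positive/two-negative eigenvalue signature:

```python
import numpy as np

rng = np.random.default_rng(3)

def phi_z(W):
    """Zero-skew constraint on a DIAC W = M M^T:
    (m1 x m3).(m2 x m3) = W12*W33 - W13*W23 by the Lagrange identity."""
    return W[0, 1] * W[2, 2] - W[0, 2] * W[1, 2]

def phi_u(W):
    """Unit aspect-ratio constraint: |m1 x m3|^2 - |m2 x m3|^2."""
    return (W[0, 0] * W[2, 2] - W[0, 2] ** 2) \
         - (W[1, 1] * W[2, 2] - W[1, 2] ** 2)

# Sanity check of (17) and (21) on a synthetic M = K R^T (assumed values).
f, a, s, u, v = 800.0, 1.2, 0.3, 320.0, 240.0
K = np.array([[f, s, u], [0.0, a * f, v], [0.0, 0.0, 1.0]])
Q0, _ = np.linalg.qr(rng.standard_normal((3, 3)))
R = Q0 * np.sign(np.linalg.det(Q0))          # proper rotation
M = K @ R.T
assert np.isclose(phi_z(M @ M.T), a * f * s)                  # eq. (17)
assert np.isclose(phi_u(M @ M.T), s**2 + (1 - a**2) * f**2)   # eq. (21)

# Recover Phi_z for a generic camera P by polarizing the homogeneous
# quadratic q -> phi_z(P Q(q) P^T), then inspect its eigenvalues.
def q_to_Q(q):
    Q = np.zeros((4, 4))
    iu = np.triu_indices(4)
    Q[iu] = q
    return Q + Q.T - np.diag(np.diag(Q))

P = rng.standard_normal((3, 4))
f_q = lambda q: phi_z(P @ q_to_Q(q) @ P.T)
E = np.eye(10)
Phi = np.array([[0.5 * (f_q(E[k] + E[l]) - f_q(E[k]) - f_q(E[l]))
                 for l in range(10)] for k in range(10)])

w = np.linalg.eigvalsh(Phi)
tol = 1e-9 * np.abs(w).max()
assert np.sum(np.abs(w) > tol) == 4                       # rank-4
assert np.sum(w > tol) == 2 and np.sum(w < -tol) == 2     # ruled quadric
```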
6.1 Cost Function for Estimating the Absolute Dual Quadric Q
From the above formulations of the constraints, the determination of H 1 can be posed as a non-linear minimization problem with cost function
where Φ k can be any ruled quadric representing constraints, k is the index for the summation over all the included constraints and M is the total number of selected constraints. For example, when there are m cameras and both zero-skew and unit aspect-ratio constraints are applied on all views, the number of constraints will be M=2m. In the form of (17) and (21), the constraints are weighted by different scaling factors such as α i f i and \(f_{i}^{2}\). To equalize this weighting effect on each Φ k , we divide each Φ k by its own eigenvalue with the largest magnitude.
6.2 Solving Its Upper Bound
Let us denote the eigenvalue decomposition of the symmetric matrix Φ k as \({V}_{k} {\varLambda}_{k} {V}_{k}^{T}\) where Λ k is a diagonal matrix containing all eigenvalues of Φ k and V k is an orthogonal matrix containing the corresponding eigenvectors. The eigenvalues in Λ k are sorted in descending order. Denote the diagonal sub-blocks of Λ k containing the positive, zero and negative eigenvalues as \({e}_{k}^{+} \in \mathbb{R}^{2 \times 2}\), 06×6, \({e}_{k}^{-} \in \mathbb{R}^{2 \times 2}\), respectively.
Each constraint can be expressed as
We then define
Note that \({\varPhi}_{k}^{*}\) is a positive semi-definite matrix. We can obtain an upper bound of (26) by a minimization problem
where \({\varPhi}^{*}= (\sum_{k=1}^{M} {\varPhi}_{k}^{*} )\). As each \({\varPhi}_{k}^{*}\) is previously normalized so its largest eigenvalue is 1, the relationship between ε Q and \({\varepsilon}_{Q}^{*}\) can be written as
The minimum of (28) is the smallest eigenvalue of Φ ∗, and the corresponding eigenvector is chosen as q. As Φ ∗∈ℝ10×10, to obtain a unique solution to the minimization problem (28), Φ ∗ should be close to rank-9. In our experiments on real and synthetic data, the minimization problem (28) always returns a relatively small value, of the order of 10−5, for each constraint.
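A compact sketch of this step (NumPy; `psd_envelope` and `solve_q` are hypothetical helper names, and \({\varPhi}_{k}^{*}={V}_{k}|{\varLambda}_{k}|{V}_{k}^{T}\) is the assumed concrete form of the definition above, i.e., the sign of the negative block \({e}_{k}^{-}\) is flipped):

```python
import numpy as np

def psd_envelope(Phi):
    """Phi_k* = V_k |Lambda_k| V_k^T, normalized so that its largest
    eigenvalue is 1 (the equalization described in Sect. 6.1)."""
    w, V = np.linalg.eigh(Phi)
    Phi_star = (V * np.abs(w)) @ V.T        # V diag(|w|) V^T
    return Phi_star / np.abs(w).max()

def solve_q(Phi_list):
    """Minimize the upper bound q^T Phi* q over unit vectors q:
    the smallest eigenvector of Phi* = sum_k Phi_k*."""
    Phi_star = sum(psd_envelope(Phi) for Phi in Phi_list)
    w, V = np.linalg.eigh(Phi_star)         # NumPy sorts ascending
    return V[:, 0], w[0]

# Toy check with random symmetric "constraint" matrices (assumed data).
rng = np.random.default_rng(4)
Phis = [0.5 * (A + A.T) for A in rng.standard_normal((6, 10, 10))]
q, eps_star = solve_q(Phis)
assert np.isclose(np.linalg.norm(q), 1.0)
# Each envelope bounds the normalized original quadratic from above,
# which is the inequality behind the relaxation (26) -> (28).
for Phi in Phis:
    Phi_n = Phi / np.abs(np.linalg.eigvalsh(Phi)).max()
    assert q @ psd_envelope(Phi) @ q >= abs(q @ Phi_n @ q) - 1e-12
```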
6.3 Decomposing H 1 from Q
To compute H 1 from the estimated absolute dual quadric Q, Q must be a rank-3 positive semi-definite matrix. Empirically, when the number of views is large enough, Q formed from q by (12) and (13) will be close to rank-3 satisfying the positive semi-definite condition.
By means of singular value decomposition, the computed absolute dual quadric Q can be factorized as Q=USU T. Take \({H}_{1}={U}_{3}{S}_{3}^{1/2}\),
where U 3 is the first three columns of U and S 3 is the left upper 3×3 matrix of S.
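A sketch of this factorization step (NumPy; `decompose_H1` is a hypothetical helper name). Note that H 1 is determined by Q only up to a 3×3 orthogonal factor, since any H 1 V with orthogonal V gives the same Q, so the check compares the products:

```python
import numpy as np

def decompose_H1(Q):
    """H1 = U3 S3^{1/2}: rank-3 factor of the (near) PSD quadric Q."""
    U, S, _ = np.linalg.svd(Q)          # symmetric PSD Q => Q = U S U^T
    return U[:, :3] * np.sqrt(S[:3])    # U3 diag(sqrt(S3))

# Round-trip check on an exactly rank-3 PSD quadric (assumed data).
rng = np.random.default_rng(5)
H1_true = rng.standard_normal((4, 3))
Q = H1_true @ H1_true.T
H1 = decompose_H1(Q)
assert np.allclose(H1 @ H1.T, Q)        # recovered up to orthogonal factor
```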
The proposed algorithm is shown in Algorithm 1.
7 Self-calibration (Decomposition of K, R, t)
After the projective distortion matrix H has been recovered, all the projection matrices \(\hat{{P}}_{i}\) in the projective frame can be upgraded to a Euclidean frame as \(\tilde{{P}}_{i}=\hat{{P}}_{i}{H}\),
and the projective shape \(\hat{\mathbf{X}}\) can be upgraded to the Euclidean frame as \(\tilde{\mathbf{X}}_{j}\sim {H}^{-1}\hat{\mathbf{X}}_{j}\).
To extract the intrinsic and extrinsic parameters from the projection matrices in the Euclidean frame, we can apply QR factorization [10] to decompose the left-most 3×3 matrix of \(\tilde{{P}}_{i}\) into \(\alpha_{i}\tilde{{K}}_{i} \tilde{{R}}_{i}\). The decomposition should satisfy the requirements that all the diagonal entries of \(\tilde{{K}}_{i}\) are positive, \(\tilde{{K}}_{i} ( 3,3 ) =1\) and the determinant of the rotation matrix \(\tilde{{R}}_{i}\) equals 1, i.e. \(\vert \tilde{{R}}_{i}\vert =1\). All these requirements can be enforced during the decomposition. The translation of the cameras can then be obtained by applying \({-}\frac{1}{\alpha_{i}}\tilde{{R}}_{i}\tilde{{K}}_{i}^{-1}\) to the last column of \(\tilde{{P}}_{i}\). The upgraded Euclidean projection matrices are in the form of
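One common concrete recipe for this step is an RQ factorization built from NumPy's QR via row/column flips. The sketch below (with hypothetical helper names `rq` and `decompose_P`, and illustrative camera values) enforces the three requirements above, using the free overall scale of the projection matrix:

```python
import numpy as np

def rq(A):
    """RQ decomposition A = R Q (R upper triangular, Q orthogonal)
    of a 3x3 matrix, built from NumPy's QR via row/column flips."""
    Q0, R0 = np.linalg.qr(np.flipud(A).T)
    R = np.flipud(np.fliplr(R0.T))
    Q = np.flipud(Q0.T)
    return R, Q

def decompose_P(P):
    """Split P ~ K [R^T | -R^T t] into K (positive diagonal, K(3,3)=1),
    the world-to-camera rotation R^T, and the camera centre t."""
    if np.linalg.det(P[:, :3]) < 0:
        P = -P                          # P is only defined up to scale
    K, Rot = rq(P[:, :3])
    D = np.diag(np.sign(np.diag(K)))    # D^2 = I: flip signs so that
    K, Rot = K @ D, D @ Rot             # diag(K) > 0 and det(Rot) = +1
    t = -np.linalg.solve(Rot, np.linalg.solve(K, P[:, 3]))
    return K / K[2, 2], Rot, t

# Round-trip check on a synthetic camera (assumed values).
rng = np.random.default_rng(6)
K_true = np.array([[700.0, 0.0, 320.0],
                   [0.0, 700.0, 240.0],
                   [0.0, 0.0, 1.0]])
Q0, _ = np.linalg.qr(rng.standard_normal((3, 3)))
R_true = Q0 * np.sign(np.linalg.det(Q0))
t_true = rng.standard_normal(3)
P = K_true @ np.hstack([R_true.T, (-R_true.T @ t_true).reshape(3, 1)])

K, Rot, t = decompose_P(2.5 * P)        # arbitrary overall scale
assert np.allclose(K, K_true)
assert np.allclose(Rot, R_true.T)
assert np.allclose(t, t_true)
```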
The Self-Calibration Algorithm is summarized in Algorithm 2.
8 Experimental Results
In this section, the proposed method is evaluated using synthetic data and real data. The reconstructed projective spaces are computed first by the methods proposed by Tang and Hung [13, 28] for minimizing the 2D reprojection error.
8.1 Synthetic Data
A synthetic scene has been constructed with 3 virtual grid planes (i.e. three planes of the blue dotted box) as shown in Fig. 1. On each plane, there are 25 lattice points of a 4×4 grid and the dimension of the grid is 0.4 m×0.4 m, so there are a total of 75 3D points in the scene. Each plane is perpendicular to the other two planes. The intrinsic parameters for all cameras are fixed as
There are 10 cameras randomly located within the red dashed box (of size 3 m×3 m×2 m) with fixed intrinsic parameters and pointing towards the centroid of the 3D points, such that the images of the 3D points almost fully occupy an image of size 1000×800 for all the views. The yellow pyramids in Fig. 1 are the cameras. We apply only the zero-skew and unit aspect-ratio constraints in the experiments. There are two sets of evaluation results. The first set evaluates the reconstructed scene: the angles between any two planes are computed to assess the orthogonality of the upgraded scene. The second set compares the estimated intrinsic parameters of the cameras with the ground truth at different levels of Gaussian noise. Gaussian noise with standard deviation from 0 to 4 pixels, in increments of 0.5 pixel, is added to the images. The tests are repeated 50 times and the mean values are computed.
8.1.1 Orthogonality of Planes in the Upgraded Scene
There are three orthogonal planes in this synthetic scene. In each trial, each plane is computed by minimizing the geometric error of the normal distances from the 3D points to the plane. The RMS error of the deviations of the angles from 90∘ is calculated, as shown in Fig. 2(a). The maximum deviation of the angles is less than 0.14∘ even when the 2D points are contaminated by Gaussian noise with σ=4 pixels. The error of the angles increases gradually with the level of added noise.
8.1.2 Performance on Intrinsic Parameters
Figure 2(b) shows how the estimated intrinsic parameters vary with noise. There are 50 trials for each noise level. Figure 2(b) presents the intrinsic parameters in an arrangement similar to the matrix form of K. The unit aspect-ratio constraint works well since the two diagrams at positions (1, 1) and (2, 2) in Fig. 2(b) are almost the same and slightly larger than the expected value 2000 (but by no more than 0.5 %). The zero-skew constraint forces the skew factors to be around 0. The principal points are also around (500, 500).
Histograms showing the distributions of the estimated intrinsic parameters (for the 50 trials) when the 2D points are contaminated by low and high noise levels are given in Figs. 2(c) and 2(d) respectively. When Gaussian noise with a standard deviation of 1 pixel is added, the parameters are highly concentrated around the ground truth and the maximum deviation is less than 3 % for the principal point and the focal length. When Gaussian noise with a standard deviation of 4 pixels is added, the distributions of the parameters spread wider and the corresponding maximum deviation is around 12 %.
8.2 Real Image Sequences
There are 12 real image sequences selected for testing the proposed method. The first 7 standard image sequences are obtained from the Visual Geometry Group (VGG) at the University of Oxford. The remaining image sequences are taken with the same DSLR camera and lens. The image sequences ‘Mickey Mouse’ and ‘Dr. Sun Statue’ are taken with random motion and with the auto-focus function enabled so that the objects can be seen clearly and are located at the center of each image. The other image sequences (i.e., ‘Tigger’, ‘Spiderman’ and ‘Terra-cotta Warrior’) are taken with the objects undergoing circular motion on a turntable and the camera fixed on a tripod. Each shot is taken after the object is rotated by 10∘. For the real image sequences, we use only the zero-skew and unit aspect-ratio constraints, since these two constraints can be assumed to be well satisfied in high-quality cameras.
Figure 3 shows the results on two sets of VGG real data. Figure 3(a) shows the 1st image of the Model House image sequence. Most of the 2D corresponding points lie on the 3 main planes in the scene, i.e., the floor, the front wall which is perpendicular to the floor, and the front side of the roof. After applying our method to the projective reconstruction result of Hung and Tang [13], one of the images is projected onto the upgraded 3D points as a textured 3D model; a side view of the textured model with the cameras is shown in Fig. 3(b). Similarly, another set of results for the Merton College II sequence is shown in Fig. 3(d).
Two more sets of results reconstructed from our own image sequences are shown in Fig. 4. For the Dr. Sun Statue image sequence, Fig. 4(a) shows the 6th image. The camera was moved around the statue by a photographer, staying approximately on a horizontal plane at eye level. The reconstructed scene with cameras is shown in Fig. 4(b). From the side view of the textured 3D model in Fig. 4(c), the shape of the Dr. Sun Statue can be seen clearly. Our method is also applied to upgrade a projective reconstruction from a circular-motion image sequence, as in Fig. 4(d). There are 36 images taken around the Terra-cotta Warrior on a turntable, with a rotation of 10∘ between consecutive images. To illustrate the reconstructed shape of the sculpture, 3 views of the 3D point cloud captured from different angles are shown in Figs. 4(e) to 4(g). The results for the other image sequences are not shown here as they are similar to those shown.
Table 1 shows the performance of the proposed method by means of two sets of numerical data. The first set is for estimating the absolute dual quadric Q and the second set is for solving H 1. For the first set, the number of constraints and the 1st, 9th and 10th singular values of Φ ∗ are listed as M, s 1, s 9 and \({\varepsilon}_{{Q}}^{*}=s_{10}\) respectively. The results show that a distinctive null space of Φ ∗ can be identified for each of the data sets, despite the relaxation of the original minimization problem (26) to (28). The next column, ε Q (q), denotes the values of the original cost function (26) evaluated at the singular vector corresponding to the smallest singular value s 10. It shows that all results satisfy the inequality relationship (29) and that all \({\varepsilon}_{{Q}}^{*}=s_{10}\) are very small compared with their corresponding s 1 and s 9. The rank-9 condition is thus fulfilled for estimating Q. The second set, namely the ratios s 3/s 1 and s 4/s 1 of the singular values of Q, shows that Q is approximately of rank-3.
8.3 Comparison
A linear method proposed by Pollefeys et al. [19] is implemented for comparison on simulated and real data. This linear method assumes that the varying focal lengths across multiple views are the only unknowns. The zero-skew, unit aspect-ratio and known principal-point constraints are directly enforced in the formulation. The image sequence Oxford Model House from VGG is selected. The set of 2D corresponding points \(\hat{\mathbf{w}}_{ij}\) across multiple images, the reconstructed 3D points M j , and the reconstructed camera intrinsic \(\hat{{K}}_{i}\) and extrinsic parameters \(\hat{{R}}_{i}\) and \(\hat{\mathbf{t}}_{i}\) (i.e., the projection matrices \(\hat{{P}}_{i} = \hat{{K}}_{i} \hat{{R}}_{i}^{T} [{I}_{3} \ |{-}\hat{\mathbf{t}}_{i}]\)) are also provided and taken as ground truth in this paper. The image size is 768×576 pixels. For the reconstructed cameras in Euclidean space, the focal lengths vary between 594 and 672 pixels and the maximum variation of the principal points over the 10 images is 93.5 pixels.
8.3.1 Synthetic Data
To compare both methods on reconstructed data in simulation, we generated a random projective distortion matrix H∈ℝ4×4 to downgrade the Euclidean space to a projective space. To satisfy the constraints of the method of [19], the 2D points and projection matrices are first transformed by the known intrinsic parameters. Variations of the principal points are denoted as \((\Delta c_{i}^{x},\Delta c_{i}^{y})\) on the x- and y-axes respectively. Hence, the projection matrices and 3D points in the projective space and the corresponding 2D points for the ith view are as follows
and
where \(\bar{f}\) is the mean value of the ground-truth focal lengths, which is 638.52 pixels. Both methods are applied to upgrade the projective space back to Euclidean spaces, and their estimated focal lengths in the x- and y-axes for the ith view and the kth trial are denoted as \(f^{x}_{ik}\) and \(f^{y}_{ik}\) respectively. Our method is used with the zero-skew and unit aspect-ratio constraints only. The root mean square error (RMSE) on the focal lengths, ϵ, across multiple views and trials for a given level of variation of the principal points is expressed as
where the number of trials is L and the number of views is m. After 200 trials (i.e., L=200), the results are summarized in Fig. 5. Except when the added noise level is less than 10 pixels, there are cases in which the method of [19] fails to return a reasonable solution. In the failed cases, the least-squares problem is usually rank-deficient, and the reconstructed space remains as distorted as a projective space or collapses into a planar object. The ratio of failed cases versus the noise added to the principal points is shown in Fig. 5(a). The percentage of failed cases can be more than 50 % at some noise levels. Once a shift of 30 pixels is added to the principal points, the percentage of failed cases is at least 30 %. In this test, however, our proposed method always returns a reasonable solution.
Figure 5(b) shows the RMSE of the focal lengths ϵ versus the added noise on the principal points; the failed cases of the method of [19] are excluded from the plot. The method of [19] returns exact solutions when the noise level is zero, and with the failed cases removed its RMSE is almost proportional to the added noise level on the principal points. Our method does not return the exact solution even when no noise is added, because the original minimization problem (26) is replaced by its upper bound (28), and nothing forces the minima of the two problems to coincide. On the other hand, our method is almost invariant to the added noise: the error on the focal lengths stays at a roughly constant level. To illustrate what the upgraded 3D spaces look like, we selected the results for the case where the principal points are shifted by 200 pixels and the method of [19] returns a reasonable solution. The results of the two methods are shown in Figs. 5(c) and 5(d). Our proposed method returns a better result: the wall is much closer to perpendicular to the floor plane.
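The RMSE measure plotted in Figs. 5(b) and 6(b) can be sketched as below. This follows the textual description above (averaging the squared focal-length errors over both axes, all m views and L trials); the exact normalization in the paper's equation may differ.

```python
import numpy as np

def focal_rmse(f_x, f_y, f_bar):
    """RMSE of estimated focal lengths f_x[i, k], f_y[i, k] (view i,
    trial k) against the mean ground-truth focal length f_bar."""
    m, L = f_x.shape
    sq = (f_x - f_bar) ** 2 + (f_y - f_bar) ** 2
    return np.sqrt(sq.sum() / (2 * m * L))

f_bar = 638.52
f_x = np.full((10, 200), f_bar + 3.0)  # synthetic estimates: +3 px bias on x
f_y = np.full((10, 200), f_bar - 4.0)  # -4 px bias on y
print(focal_rmse(f_x, f_y, f_bar))     # sqrt((9 + 16) / 2) = 3.5355...
```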
8.3.2 Real Data
Using the same Oxford Model House data as in the previous section, we apply the projective bundle adjustment method proposed by Hung and Tang [13] to reconstruct 3D points from 2D correspondences across multiple views. The 2D points in (34), which already contain 2D errors, are first shifted by \((\Delta c_{i}^{x},\Delta c_{i}^{y})\) on the principal points and passed to the projective reconstruction method [13]. The method of [19] and our method are then applied to the resulting projective spaces for upgrading. There are again 200 trials for each noise level, and the results are shown in Fig. 6. The RMSE 2D reprojection errors from the projective bundle adjustment [13] are around 0.5 pixels. From Fig. 6(a), our method is more stable than the method of [19], and no failed case is reported for our proposed method. Figure 6(b) shows that our method remains invariant to variations of the principal points when computing the absolute dual quadric. The side views of the reconstructed 3D points from our method and the method of [19] are shown in Figs. 6(c) and 6(d) respectively.
9 Proof for Rank-4 Properties of Subspace Constraints
To prove the rank-4 properties, we first derive the dual image of the absolute conic in terms of \({H}_{1}{H}_{1}^{T}\) and \(\hat{{P}}\). For simplicity, the view index i is omitted. The projective projection matrix \(\hat{{P}}\) can be written as \(\hat{{P}}= [ \mathbf{p}_{1}\;\mathbf{p}_{2}\;\mathbf{p}_{3} ]^{T}\), where \(\mathbf{p}_{k}\) is the 4-vector forming the kth row of \(\hat{{P}}\). The dual image of the absolute conic \(\omega^{*}\) can be expressed as
The 3×3 matrix in (14) is elementwise identical to (36). The derivation of each constraint below uses the notation of (36).
9.1 Zero-Skew Constraint
The zero-skew constraint from (18) is expressed in the notation of (6) as
To relate this constraint to the quantities in (14), the cross products should be converted into dot products. First, applying the triple-product identity A⋅(B×C)≡−C⋅(B×A), the zero-skew constraint becomes
To expand the remaining cross product, the identity A×(B×C)≡B(A⋅C)−C(A⋅B) is applied; then,
Replacing the entries from (36), (37) becomes
Applying Kronecker product notation to (38), it becomes
where h is defined as the vector obtained by applying the stack operator to \({H}_{1}{H}_{1}^{T}\) (i.e., stacking its columns from left to right into a single vector),
Let us define \(\mathbf{v}_{1}=\mathbf{p}_{3}\otimes\mathbf{p}_{1}\), \(\mathbf{v}_{2}=\mathbf{p}_{3}\otimes\mathbf{p}_{2}\), \(\mathbf{v}_{3}=\mathbf{p}_{2}\otimes\mathbf{p}_{1}\) and \(\mathbf{v}_{4}=\mathbf{p}_{3}\otimes\mathbf{p}_{3}\). Then (39) can be simplified as
The quadratic forms \(\mathbf{v}_{1} \mathbf{v}_{2}^{T}\) and \(\mathbf{v}_{3} \mathbf{v}_{4}^{T}\) can always be written as sums of squares of linear functions of h as
\(f_{z}(\mathbf{h})\) can be expressed as
where
Since \({H}_{1}{H}_{1}^{T}\) is a symmetric matrix, only 10 of the 16 elements of h are independent variables. Hence, h can be expressed in terms of q∈ℝ10×1 by means of a binary matrix Z∈ℝ16×10 as
Substituting (42) into (41), we have
where
In general, the projection matrix \(\hat{{P}} = [ \mathbf{p}_{1}\ \mathbf{p}_{2}\ \mathbf{p}_{3}]^{T}\) is of full rank, so its three row vectors are linearly independent. \({T}_{z}\) is then of rank 4, since its four row vectors are also linearly independent. By (44), the quadratic form \(f_{z}(\mathbf{q})\) consists of two positive squares and two negative squares. By the law of inertia for quadratic forms [6], the numbers of positive and negative squares are invariant to the choice of basis. Hence, \({\varPhi}_{z}\) is a 10×10 symmetric matrix of rank 4 with two positive and two negative eigenvalues.
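The derivation of this section can be verified numerically. The following sketch (our own construction, with random stand-ins for \(\hat{{P}}\) and \({H}_{1}\)) checks the two triple-product identities, builds \({\varPhi}_{z}\) from the vectors \(\mathbf{v}_{1},\ldots,\mathbf{v}_{4}\) and the binary matrix Z, confirms that \(\mathbf{q}^{T}{\varPhi}_{z}\mathbf{q}\) reproduces the elementwise zero-skew expression \(\omega^{*}_{12}\omega^{*}_{33}-\omega^{*}_{13}\omega^{*}_{23}\) (up to an overall sign convention), and checks the rank-4, signature-(2,2) property:

```python
import numpy as np

rng = np.random.default_rng(4)

# -- the two cross-product identities used in the derivation --
A, B, C = rng.standard_normal((3, 3))
assert np.isclose(A @ np.cross(B, C), -C @ np.cross(B, A))
assert np.allclose(np.cross(A, np.cross(B, C)), B * (A @ C) - C * (A @ B))

# -- binary matrix Z of (42): vec(X) = Z q for symmetric 4x4 X,
#    with q holding the 10 upper-triangular entries --
pairs = [(i, j) for j in range(4) for i in range(j + 1)]
Z = np.zeros((16, 10))
for col, (i, j) in enumerate(pairs):
    Z[j * 4 + i, col] = 1.0            # entry X[i, j] in column-stacked vec
    Z[i * 4 + j, col] = 1.0            # its symmetric partner X[j, i]

# -- Phi_z from a generic full-rank projection matrix --
P = rng.standard_normal((3, 4))
p1, p2, p3 = P
v1, v2 = np.kron(p3, p1), np.kron(p3, p2)
v3, v4 = np.kron(p2, p1), np.kron(p3, p3)
M = 0.5 * (np.outer(v3, v4) + np.outer(v4, v3)
           - np.outer(v1, v2) - np.outer(v2, v1))
Phi_z = Z.T @ M @ Z                    # 10x10 symmetric constraint matrix

# sanity check: q^T Phi_z q equals w12*w33 - w13*w23 on w* = P (H1 H1^T) P^T
H1 = rng.standard_normal((4, 3))
X = H1 @ H1.T
q = np.array([X[i, j] for (i, j) in pairs])
w = P @ X @ P.T
assert np.isclose(q @ Phi_z @ q, w[0, 1] * w[2, 2] - w[0, 2] * w[1, 2])

eig = np.linalg.eigvalsh(Phi_z)
print(np.sum(np.abs(eig) > 1e-9))      # rank: 4
print(np.sum(eig > 1e-9), np.sum(eig < -1e-9))  # two positive, two negative
```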
9.2 Unit Aspect-Ratio Constraint
This constraint can be expressed in the form of cross product as
By applying the above two cross product properties, it can be shown that (45) is equivalent to
Similarly to the development for the zero-skew constraint, we can apply the Kronecker product to express (45) in terms of q as
Let \(\mathbf{v}_{5}=\mathbf{p}_{1}\otimes\mathbf{p}_{3}\), \(\mathbf{v}_{6}=\mathbf{p}_{2}\otimes\mathbf{p}_{3}\) and \(\mathbf{v}_{7}=\mathbf{p}_{2}\otimes\mathbf{p}_{2}-\mathbf{p}_{1}\otimes\mathbf{p}_{1}\). Then (46) can be simplified as
Rewriting the quadratic forms as a symmetric matrix, we have
where
and the symmetric matrix \({\varPhi}_{u}\) sandwiched between \(\mathbf{q}^{T}\) and \(\mathbf{q}\) can be expressed as
By the law of inertia for quadratic forms [6], \({\varPhi}_{u}\) is also of rank 4 with two positive and two negative eigenvalues.
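An analogous numerical check can be made for \({\varPhi}_{u}\) (again with random stand-ins, and assuming the elementwise form \((\omega^{*}_{22}\omega^{*}_{33}-\omega^{*2}_{23})-(\omega^{*}_{11}\omega^{*}_{33}-\omega^{*2}_{13})=0\) of the unit aspect-ratio constraint under zero skew):

```python
import numpy as np

rng = np.random.default_rng(5)

# binary matrix Z with vec(X) = Z q for symmetric 4x4 X
pairs = [(i, j) for j in range(4) for i in range(j + 1)]
Z = np.zeros((16, 10))
for col, (i, j) in enumerate(pairs):
    Z[j * 4 + i, col] = 1.0
    Z[i * 4 + j, col] = 1.0

P = rng.standard_normal((3, 4))        # generic full-rank projection matrix
p1, p2, p3 = P
v4 = np.kron(p3, p3)
v5, v6 = np.kron(p1, p3), np.kron(p2, p3)
v7 = np.kron(p2, p2) - np.kron(p1, p1)

M = (0.5 * (np.outer(v7, v4) + np.outer(v4, v7))
     + np.outer(v5, v5) - np.outer(v6, v6))
Phi_u = Z.T @ M @ Z                    # 10x10 symmetric constraint matrix

# sanity check against the elementwise unit aspect-ratio expression
H1 = rng.standard_normal((4, 3))
X = H1 @ H1.T
q = np.array([X[i, j] for (i, j) in pairs])
w = P @ X @ P.T
lhs = (w[1, 1] * w[2, 2] - w[1, 2] ** 2) - (w[0, 0] * w[2, 2] - w[0, 2] ** 2)
assert np.isclose(q @ Phi_u @ q, lhs)

eig = np.linalg.eigvalsh(Phi_u)
print(np.sum(eig > 1e-9), np.sum(eig < -1e-9))  # two positive, two negative
```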
9.3 Partial Constant Principal-Point Constraints
For a camera with fixed intrinsic parameters, the principal point is the same across all captured images. In auto-focusing or zooming operations the principal point may vary; however, calibration experiments suggest that the variation is small. Under this assumption, any image pair from the same camera provides two additional constraints (one for each component of the principal point in the image plane). As these constraints involve a pair of images, let us denote the principal points as \([ u_{i}\;v_{i}\;1 ]^{T}\) for the ith view and \([ u_{j}\;v_{j}\;1 ]^{T}\) for the jth view.
If \(u_{i}=u_{j}\), we have
where the subscripts i and j refer to the ith and jth views respectively. Expressed in terms of \(\hat{{P}}\) and \({H}_{1}\) from (36), we have
Using the Kronecker product, this becomes
Let \(\mathbf{v}_{8}=\mathbf{p}_{3i}\otimes\mathbf{p}_{1i}\), \(\mathbf{v}_{9}=\mathbf{p}_{3j}\otimes\mathbf{p}_{3j}\), \(\mathbf{v}_{10}=\mathbf{p}_{3j}\otimes\mathbf{p}_{1j}\) and \(\mathbf{v}_{11}=\mathbf{p}_{3i}\otimes\mathbf{p}_{3i}\), and rewrite the constraint as a quadratic form. We have
where
and
Clearly, \({\varPhi}_{ij}^{x}\) is at most rank 4 and has two positive and two negative eigenvalues. From the second constraint \(v_{i}=v_{j}\), we have
Expressing it in terms of \(\hat{{P}}\) and \({H}_{1}\) from (36) and applying the Kronecker product, we have
Let \(\mathbf{v}_{12}=\mathbf{p}_{3i}\otimes\mathbf{p}_{2i}\) and \(\mathbf{v}_{13}=\mathbf{p}_{3j}\otimes\mathbf{p}_{2j}\), and rewrite the constraint as a quadratic form. We have
where
and
Clearly, \({\varPhi}_{ij}^{y}\) is at most rank 4 and has two positive and two negative eigenvalues.
The above two constraints on the x- and y-coordinates of the principal point can be applied independently, and to any image pair captured by the same camera. There is, however, no requirement that the same camera capture the whole image sequence.
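The pairwise constraint can be checked on synthetic data: two metric cameras sharing a principal point are distorted by a random H, and the bilinear residual \(\omega^{i}_{13}\omega^{j}_{33}-\omega^{j}_{13}\omega^{i}_{33}\) vanishes for the true \({H}_{1}{H}_{1}^{T}\). (Illustrative values; identity rotations for simplicity.)

```python
import numpy as np

rng = np.random.default_rng(6)

def camera(f, u, v, t):
    # metric camera K [I | -t] with focal length f and principal point (u, v)
    K = np.array([[f, 0.0, u], [0.0, f, v], [0.0, 0.0, 1.0]])
    return K @ np.hstack([np.eye(3), -np.asarray(t, float).reshape(3, 1)])

P1 = camera(600.0, 320.0, 240.0, [0.0, 0.0, -10.0])  # shared principal point
P2 = camera(650.0, 320.0, 240.0, [1.0, 0.0, -10.0])  # different focal length

H = rng.standard_normal((4, 4))                      # random projective distortion
P1h, P2h = P1 @ H, P2 @ H                            # projective projection matrices
H1 = np.linalg.inv(H)[:, :3]
X = H1 @ H1.T                                        # true absolute dual quadric

wi, wj = P1h @ X @ P1h.T, P2h @ X @ P2h.T
residual = wi[0, 2] * wj[2, 2] - wj[0, 2] * wi[2, 2]
print(abs(residual) < 1e-6)                          # True
```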
10 Additional Constraints
In this section, we develop some additional constraints for a priori information about the cameras. The technique used above to express constraints also applies to these new constraints, so they can be handled in the same framework and different constraints can be applied to different cameras, even though the new constraints are linear (rather than quadratic) in the absolute dual quadric Q.
10.1 Known Principal Points
When the principal points are known for some (or all) views, a 2D translation can be applied to those views so that the translated principal points become (0, 0). Assuming that the principal point of the ith view is known, after translating it to the origin, the dual image of the absolute conic of the ith view becomes
Comparing (50) with (14), we can deduce two constraints on q as
Both conditions are linear in q and can be expressed as
where \({\varPhi}_{i}^{x0}\) and \({\varPhi}_{i}^{y0}\in \mathbb{R}^{1\times 10}\). Expressing these conditions by Kronecker product, we get
Both \({\varPhi}_{i}^{x0}\) and \({\varPhi}_{i}^{y0}\) are first scaled to unit-norm vectors so that \(\vert {\varPhi}_{i}^{x0}\vert =\vert {\varPhi}_{i}^{y0}\vert =1\). To integrate them with the previous constraints in (26), the linear constraints (52) are transformed as \(\mathbf{q}^{T} \{ ({\varPhi}_{i}^{x0} )^{T} {\varPhi}_{i}^{x0} \} \mathbf{q}=0\), where \(({\varPhi}_{i}^{x0} )^{T} {\varPhi}_{i}^{x0} \in \mathbb{R}^{10 \times 10}\) is a rank-1 matrix. These constraints can also be used on their own to determine Q, given at least 5 (>9/2) cameras with known principal points.
10.2 Known Principal Points and Euclidean Image Planes
A Euclidean image plane satisfies the zero-skew and unit aspect-ratio constraints. Applying these two additional constraints to (50), we have
It follows from (14) and (50) that the zero-skew constraint can be simply expressed as
and the unit aspect-ratio constraint can be written as
Both conditions can be transformed as
where
and
After scaling both \({\varPhi}_{i}^{z0}\) and \({\varPhi}_{i}^{u0}\) to unit-norm vectors so that \(\vert {\varPhi}_{i}^{z0}\vert = \vert {\varPhi}_{i}^{u0}\vert =1\), they can also be rewritten in quadratic form as rank-1 matrices and integrated with the previous constraints in (26). Combining these two constraints with (52), Q can be solved independently from at least 3 views (>9/4) with known principal points and Euclidean image planes.
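This view count can be illustrated with a synthetic linear solve (our own sketch, with illustrative camera parameters): three views with principal points at the origin and Euclidean image planes give 12 constraints linear in q, whose null space recovers \({H}_{1}{H}_{1}^{T}\) up to scale.

```python
import numpy as np

rng = np.random.default_rng(7)

# binary matrix Z with vec(X) = Z q for symmetric 4x4 X
pairs = [(i, j) for j in range(4) for i in range(j + 1)]
Z = np.zeros((16, 10))
for col, (i, j) in enumerate(pairs):
    Z[j * 4 + i, col] = 1.0
    Z[i * 4 + j, col] = 1.0

def rot(axis, a):
    c, s = np.cos(a), np.sin(a)
    return {"x": np.array([[1, 0, 0], [0, c, -s], [0, s, c]]),
            "y": np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])}[axis]

def camera(f, R, t):
    # zero skew, unit aspect ratio, principal point at the origin
    K = np.diag([f, f, 1.0])
    return K @ R.T @ np.hstack([np.eye(3), -np.asarray(t, float).reshape(3, 1)])

H = rng.standard_normal((4, 4))          # random projective distortion
H1 = np.linalg.inv(H)[:, :3]
X_true = H1 @ H1.T                       # ground-truth absolute dual quadric

rows = []
for f, R, t in [(600.0, rot("x", 0.3), [0.0, 1.0, -10.0]),
                (650.0, rot("y", 0.4), [2.0, 0.0, -9.0]),
                (700.0, rot("x", -0.2), [-1.0, 2.0, -11.0])]:
    p1, p2, p3 = camera(f, R, t) @ H
    for r in (Z.T @ np.kron(p3, p1),                       # w13 = 0
              Z.T @ np.kron(p3, p2),                       # w23 = 0
              Z.T @ np.kron(p2, p1),                       # w12 = 0 (zero skew)
              Z.T @ (np.kron(p1, p1) - np.kron(p2, p2))):  # w11 = w22
        rows.append(r / np.linalg.norm(r))                 # unit-norm scaling

A = np.array(rows)                       # 12 linear constraints on 10 unknowns
q = np.linalg.svd(A)[2][-1]              # null-space vector, defined up to scale
q_true = np.array([X_true[i, j] for (i, j) in pairs])
q_true /= np.linalg.norm(q_true)
err = min(np.linalg.norm(q - q_true), np.linalg.norm(q + q_true))
print(err < 1e-6)                        # True
```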
11 Conclusion
A flexible self-calibration algorithm has been proposed for recovering the projective distortion matrix that upgrades a projective frame to a Euclidean frame. The common metric constraints for self-calibration are unified in a single framework in which they are represented as 4D ruled quadrics in a 10D space. This framework is very flexible, allowing different metric constraints to be customized for different camera configurations. The projective distortion matrix is obtained by minimizing a single cost function, namely (26); in practice, we propose minimizing an upper bound of this cost function, and experiments show that the results are very satisfactory both for the synthetic data of Sect. 8.1 and the real data of Sect. 8.2. The results can be further refined by iterative non-linear algorithms or Euclidean bundle adjustment; the proposed method provides a flexible and reliable starting point for such refinement.
References
Bougnoux, S.: From projective to Euclidean space under any practical situation, a criticism of self-calibration. In: IEEE Int. Conf. Computer Vision, pp. 790–796 (1998)
Chen, G.Q., Medioni, G.G.: Practical algorithms for stratified structure-from-motion. Image Vis. Comput. 20, 103–123 (2002)
Faugeras, O.D.: Three-Dimensional Computer Vision: A Geometric Viewpoint. MIT Press, Cambridge (1993)
Faugeras, O.D., Luong, Q.-T., Maybank, S.J.: Camera self-calibration: theory and experiments. In: European Conf. on Computer Vision, Santa Margherita, Italy, pp. 321–334 (1992). citeseer.nj.nec.com/faugeras92camera.html
Fusiello, A.: A new autocalibration algorithm: experimental evaluation. In: Skarbek, W. (ed.) Computer Analysis of Images and Patterns. LNCS, vol. 2124, pp. 717–724. Springer, Berlin (2001)
Gantmacher, F.R.: The Theory of Matrices, vol. I. Chelsea, London (1959)
Gurdjos, P., Bartoli, A., Sturm, P.: Is dual linear self-calibration artificially ambiguous? In: IEEE Int. Conf. Computer Vision, Kyoto, Japan, pp. 88–95 (2009)
Han, M., Kanade, T.: Scene reconstruction from multiple uncalibrated views. Tech. Rep. CMU-RI-TR-00-09, Robotics Institute, Carnegie Mellon University (2000)
Hartley, R.I.: Euclidean reconstruction from uncalibrated views. In: European Conf. on Computer Vision, pp. 579–587 (1994)
Hartley, R.I., Zisserman, A.: Multiple View Geometry in Computer Vision, 2nd edn. Cambridge University Press, Cambridge (2004). ISBN: 0521540518
Heyden, A.: Reconstruction from image sequences by means of relative depths. In: IEEE Int. Conf. Computer Vision, pp. 1058–1063 (1995). citeseer.nj.nec.com/article/heyden95reconstruction.html
Heyden, A., Åström, K.: Euclidean reconstruction from image sequences with varying and unknown focal length and principal point. In: IEEE Int. Conf. on Computer Vision & Pattern Recognition, San Juan, Puerto Rico, pp. 438–443 (1997)
Hung, Y.S., Tang, W.K.: Projective reconstruction from multiple views with minimization of 2D reprojection error. Int. J. Comput. Vis. 66(3), 305–317 (2006)
Maybank, S.J., Faugeras, O.D.: A theory of self-calibration of a moving camera. Int. J. Comput. Vis. 8(2), 123–151 (1992)
Mendonça, P., Cipolla, R.: A simple technique for self-calibration. In: IEEE Int. Conf. on Computer Vision & Pattern Recognition, vol. I, pp. 500–505 (1999)
Pollefeys, M.: Self-calibration and metric 3d reconstruction from uncalibrated image sequences. Ph.D. thesis, ESAT-PSI, KU Leuven (1999)
Pollefeys, M., Gool, L.V.: Stratified self-calibration with the modulus constraint. IEEE Trans. Pattern Anal. Mach. Intell. 21(8), 707–724 (1999)
Pollefeys, M., Koch, R., Gool, L.V.: Self-calibration and metric reconstruction in spite of varying and unknown intrinsic camera parameters. Int. J. Comput. Vis. 32(1), 7–25 (1999). http://www.springerlink.com/content/m614017373626881/
Pollefeys, M., Koch, R., Van Gool, L.: Self-calibration and metric reconstruction in spite of varying and unknown internal camera parameters. In: IEEE Int. Conf. Computer Vision, pp. 90–95 (1998)
Ponce, J.: On computing metric upgrades of projective reconstructions under the rectangular pixel assumption. In: Proc. of the SMILE 2000 Workshop on 3D Structure from Multiple Images of Large-Scale Environments. LNCS, vol. 2018, pp. 52–67 (2000). http://www.springerlink.com/content/kut09cu6qtykf3aq/
Sainz, M., Bagherzadeh, N., Susin, A.: Recovering 3d metric structure and motion from multiple uncalibrated cameras. In: Int. Conference on Information Technology: Coding and Computing, pp. 268–273 (2002)
Schaffalitzky, F.: Direct solution of modulus constraints. In: Proc. Indian Conf. on Computer Vision, Graphics and Image Processing, pp. 314–321 (2000). http://www.robots.ox.ac.uk/~vgg/vggpapers/Schaffalitzky2000b.ps.gz
Seo, Y., Heyden, A.: Auto-calibration from the orthogonality constraints. In: IEEE Int. Conf. Pattern Recognition, Barcelona, Spain, pp. 67–71 (2000)
Seo, Y., Heyden, A.: Auto-calibration by linear iteration using the DAC equation. Image Vis. Comput. 22, 919–926 (2004)
Seo, Y., Hong, K.-S.: A linear metric reconstruction by complex eigen-decomposition. IEICE Trans. Inf. Syst. E84-D(12), 1626–1632 (2001)
Sturm, P.: Critical motion sequences for monocular self-calibration and uncalibrated Euclidean reconstruction. In: IEEE Int. Conf. on Computer Vision & Pattern Recognition, Puerto Rico, pp. 1100–1105 (1997)
Sturm, P.: A case against Kruppa’s equations for camera self-calibration. IEEE Trans. Pattern Anal. Mach. Intell. 22(10), 1199–1204 (2000). citeseer.nj.nec.com/sturm00case.html
Tang, W.K., Hung, Y.S.: A column-space approach to projective reconstruction. Comput. Vis. Image Underst. 101(3), 166–176 (2006)
Tang, W.K., Hung, Y.S.: A subspace method for projective reconstruction from multiple images with missing data. Image Vis. Comput. 54(5), 515–524 (2006)
Tomasi, C., Kanade, T.: Shape and motion from image streams under orthography: a factorization method. Int. J. Comput. Vis. 9(2), 137–154 (1992)
Triggs, B.: Autocalibration and the absolute quadric. In: IEEE Int. Conf. on Computer Vision & Pattern Recognition, pp. 609–614 (1997)
Zeller, C., Faugeras, O.: Camera self-calibration from video sequences: the Kruppa equations revisited. Tech. Rep. 2793, INRIA (1996)
Acknowledgements
The work described in this paper was supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. HKU712911E) and CRCG SPF of the University of Hong Kong.
Open Access
This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
Tang, A.W.K., Hung, Y.S. A Self-calibration Algorithm Based on a Unified Framework for Constraints on Multiple Views. J Math Imaging Vis 44, 432–448 (2012). https://doi.org/10.1007/s10851-012-0336-0