1 Introduction

In the context of computer vision, the encoding of the essential geometry of situations in which multiple images of the same scene are taken from multiple cameras leads to interesting families of tensors [1]. These multiview tensors were introduced, in their most general form, by Hartley and Schaffalitzky [2], and they describe the multiview geometry of scenes in higher-dimensional spaces, with views taken as projections onto multiple target spaces of different dimensions. Special cases of these tensors are the classical fundamental matrix for two projections from \({\mathbb P}^{3}\) to \({\mathbb P}^{2},\) and the trifocal and quadrifocal tensors for, respectively, three and four projections from \({\mathbb P}^{3}\) to \({\mathbb P}^{2}.\) Properties of the classical fundamental matrix and of the trifocal and quadrifocal tensors, including the computation of their rank (see [3] for a thorough discussion of the intricacy of the notion of rank of a tensor) and the dimension and equations of the variety that parameterizes these objects, have been investigated by several authors [4–12]. Nothing is known in general about the rank of these tensors or about the variety that parameterizes them once the dimension of the ambient space is greater than or equal to 4. This paper is intended as a first step in the investigation of these questions in higher dimensions, and it is thus concerned with the Grassmann tensor for two projections from \({\mathbb P}^{k}\) to \({\mathbb P}^{h_1}\) and \({\mathbb P}^{h_2},\) with \(h_1+h_2 \ge k+1\) and \(k \ge 3.\) In this case, the tensor of interest is a matrix that generalizes the classical fundamental matrix. As in the classical case, this matrix gives a correspondence between proper linear subspaces of the two views and thus represents a map between suitable Grassmannians. Theorem 1 shows that, generalizing the classical case, the rank of such a matrix is not maximal and is given by an explicit formula. In several significant cases, the linear relations between rows (or columns) of the matrix, responsible for the drop in rank, are given explicitly. These relations turn out to follow from generalized Plücker relations [13].

Taking inspiration from [7], the dimension of the variety of generalized fundamental matrices is computed, and this variety is shown to be birational to the quotient of an affine cone over a suitable Grassmannian. Moreover, generalizing a known result in the classical case, generalized fundamental matrices of rank 2 are completely characterized.

As is well known, the relevance of Grassmann tensors in computer vision is due to their fundamental role in solving the problem of projective reconstruction [2, 4, 14–17]. For this reason, even though this work is primarily concerned with theoretical aspects in a complex projective setting, geometric algorithms to perform reconstruction in some cases are described.

The paper is structured as follows. Section 2 collects simple background material and sets basic notation. In Sect. 3, the generalized fundamental matrix is introduced and studied. In particular, its rank is computed in Theorem 1 and relations among its rows (columns) are investigated in Sect. 3.3. Section 4 studies the variety of generalized fundamental matrices, and, finally, Sect. 5 presents the algorithmic application of the generalized fundamental matrix to projective reconstruction.

2 Notation and background material

In this section, we collect all the necessary notation and background material from classical algebraic geometry and multiview projective geometry in the context of computer vision.

2.1 Projective spaces and Grassmannians

\({\mathbb P}^{k}\) denotes the k-dimensional complex (or, when specified, real) projective space. Whenever multiplication by a nonzero scalar is used, the scalar will be complex or real accordingly. Once a projective frame is chosen, coordinate vectors \(\mathbf {X}\) of points of \({\mathbb P}^{k}\) are written as columns, and thus their transpose is \(\mathbf {X}^T =(X_1,\ldots ,X_{k+1}).\) For the sake of simplicity, we sometimes do not distinguish between points in \({\mathbb P}^{k}\) and vectors of their coordinates and, in this context, a basis for the underlying vector space \(\mathbb {C}^{k+1}\) (or \(\mathbb {R}^{k+1}\)) will be called a basis for \({\mathbb P}^{k}.\) A linear projective subspace \(\varLambda \subseteq {\mathbb P}^{k}\) spanned by \(m+1\) linearly independent points will be called an m-space, or a subspace of dimension m. By convention, the empty set is considered as a \(({-}1)\)-space.

For integers \(0\le l\le m,\) \(G(l,m) = G(l,{\mathbb P}^{m})\) denotes the Grassmannian of the l-spaces in \({\mathbb P}^{m},\) i.e., the set parameterizing these subspaces. Each l-space \(L\subset {\mathbb P}^{m}\) is determined by \(l+1\) independent points \(\mathbf {Q_{i}}\), \(i=1,\ldots ,l+1.\) Denoting by \(q_{ij},\) \(j=1,\ldots , m+1,\) the homogeneous coordinates of \(\mathbf {Q_{i}}\), so that \(\mathbf {Q_{i}}^T= (q_{i1},\ldots ,q_{i,m+1}),\) one can consider the \((l+1)\times (m+1)\) matrix \(\mathfrak {Q}=[q_{ij}]\) and its \({\left( {\begin{array}{c}m+1\\ l+1\end{array}}\right) }\) maximal \((l+1)\times (l+1)\) minors. One usually denotes by \(p_{i_1,\ldots , i_{l+1}}\) the \((l+1)\times (l+1)\) minor of \(\mathfrak {Q}\) corresponding to the columns \(i_1,\ldots, i_{l+1}.\) The minors \(p_{i_1,\ldots , i_{l+1}}\) with \(i_1< \cdots <i_{l+1}\) (in lexicographic order) are called Plücker coordinates of L and determine a point \(\mathbf {L}\) in a \({\mathbb P}^{{\left( {\begin{array}{c}m+1\\ l+1\end{array}}\right) }-1}.\) In this projective space, the Plücker coordinates of all l-spaces in \({\mathbb P}^{m}\) satisfy a set of algebraic equations defining a smooth algebraic variety of dimension \((l+1)(m-l),\) called the Plücker embedding of \(G(l,m)\) in \({\mathbb P}^{{\left( {\begin{array}{c}m+1\\ l+1\end{array}}\right) }-1}.\) The map \(L \mapsto \mathbf {L},\) which associates with each l-space in \({\mathbb P}^{m}\) the vector of its coordinates as a point of \({\mathbb P}^{{\left( {\begin{array}{c}m+1\\ l+1\end{array}}\right) }-1}\), is called the Plücker map, and it is one-to-one onto its image. For example, \(G(1,3)\) is the Grassmannian of lines in \({\mathbb P}^{3}:\) its Plücker embedding is a four-dimensional quadric in \({\mathbb P}^{5}\), and the Plücker map is a one-to-one map from \(G(1,3)\) to this quadric. As a reference for all basic facts on Grassmannians, and more, we suggest [13].
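As a concrete illustration, the Plücker coordinates of an l-space can be computed directly as the maximal minors of a matrix of spanning points. The following minimal Python/NumPy sketch (purely illustrative; the function name is ours) does this for a line in \({\mathbb P}^{3}\) and checks that the resulting six coordinates satisfy the single quadratic equation cutting out \(G(1,3)\) in \({\mathbb P}^{5}.\)

```python
from itertools import combinations

import numpy as np

def pluecker_coordinates(Q):
    """Pluecker coordinates of the l-space spanned by the rows of the
    (l+1) x (m+1) matrix Q: all maximal minors, with the columns
    chosen in lexicographic order."""
    lp1, mp1 = Q.shape
    return np.array([np.linalg.det(Q[:, list(cols)])
                     for cols in combinations(range(mp1), lp1)])

# A line in P^3 spanned by two points: a point of G(1,3) in P^5.
Q = np.array([[1.0, 0.0, 2.0, -1.0],
              [0.0, 1.0, 1.0,  3.0]])
p = pluecker_coordinates(Q)   # p12, p13, p14, p23, p24, p34
# The quadratic Pluecker relation defining G(1,3) inside P^5:
print(p[0]*p[5] - p[1]*p[4] + p[2]*p[3])   # 0.0 up to rounding
```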

In the sequel, we will need to consider some Schubert varieties in \(G(l,m)\), for which we introduce the following notation. For a given s-space \(S \subset {\mathbb P}^{m},\) \(s < l,\) we will denote by \(\varOmega _{(l,m)}(S) \ \subset G(l,m)\) the variety of all l-spaces in \({\mathbb P}^{m}\) containing S. It turns out that \(\varOmega _{(l,m)}(S)\) is a variety of dimension \((m-l)(l-s).\)

Some standard facts concerning duality among Grassmannians will also be useful. Denote by \(\check{{\mathbb P}^{m}} = G(m-1,m)\) the m-dimensional projective space dual to \({\mathbb P}^{m}.\) Points of \(\check{{\mathbb P}^{m}}\) represent hyperplanes of \({\mathbb P}^{m}.\) As seen above, a given l-space \(L \subset {\mathbb P}^{m}\) defines a point \(\mathbf {L} \in G(l,{\mathbb P}^{m}),\) but it can be dually described via (the intersection of) the set of all hyperplanes containing it, i.e. as an \((m-l-1)\)-space in \(\check{{\mathbb P}^{m}},\) or, which is the same, as a point of \(G(m-l-1,\check{{\mathbb P}^{m}}).\) This is the reason why the Grassmannians \(G(l,m)\) and \(G(m-l-1,m)\) are isomorphic and are said to be dual to each other. For the explicit correspondence between usual and dual Plücker coordinates of a given subspace, see [18, Vol I, Book II, p. 292].

2.2 Multiview geometry

For the convenience of the reader, we fix our notation for cameras, centers of projection, and multiple views in the context of projective reconstruction from multiple views in computer vision. A scene is a set of N points \(\{\mathbf {X_i}\}\subset {\mathbb P}^{k},\) \(i=1,\ldots,N.\) A camera is represented as a central projection P of points in k-space, from a linear center \(C_P,\) onto a suitable \({\mathbb P}^{h},\) \(h< k,\) where in the traditional setting of real still images \(k=3\) and \(h=2.\) The target space \({\mathbb P}^{h},\) which in the traditional setting is usually identified with a physical image, is usually referred to as a view. We do not make any formal distinction between the projection map P and one of its matrix representations, for which we use the same symbol P. Accordingly, if \(\mathbf {X}\) is a point in \({\mathbb P}^{k},\) we denote its image in the projection equivalently as \(P(\mathbf {X})\) or \(P\cdot \mathbf {X}.\) In homogeneous coordinates, the projection mapping \(P:{\mathbb P}^{k}\setminus \{C_P \} \rightarrow {\mathbb P}^{h}\) is described by \(\mu \mathbf {x} = P \cdot \mathbf {X},\) where \(\mu \) is a nonzero constant, P is an \((h+1) \times (k+1)\) matrix with \(\mathrm{rk}\,(P)=h+1,\) and \(C_P\) is the right annihilator of P, hence a \((k-h-1)\)-space. For a given point \(\mathbf {X}\in {\mathbb P}^{k},\) the projecting ray, i.e. the join \({\langle }C_P, \mathbf {X}{\rangle },\) is a \((k-h)\)-space, as \(C_P\) is a \((k-h-1)\)-space. Let \(P^{+}\) denote a pseudoinverse for P, i.e. a matrix such that \(P \cdot P^{+}\) is the identity matrix. Then, for a given \(\mathbf {x}\in {\mathbb P}^{h},\) \( P^{-1}(\mathbf {x})={\langle }P^{+}(\mathbf {x}),C_{P}{\rangle }.\)
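The facts just recalled are easy to verify numerically. The following NumPy sketch (illustrative only, for an arbitrary generic camera from \({\mathbb P}^{4}\) to \({\mathbb P}^{2}\)) computes the center as the right annihilator, a pseudoinverse \(P^{+},\) and checks that a point lies on its own projecting ray \({\langle }C_P, P^{+}(\mathbf {x}){\rangle }.\)

```python
import numpy as np

rng = np.random.default_rng(0)

k, h = 4, 2                                  # projection P^4 --> P^2
P = rng.standard_normal((h + 1, k + 1))      # generic: rk(P) = h + 1

X = rng.standard_normal(k + 1)               # a point of P^k
x = P @ X                                    # its image in the view

P_plus = np.linalg.pinv(P)                   # a pseudoinverse: P P+ = I
print(np.allclose(P @ P_plus, np.eye(h + 1)))        # True

# The center C_P is the right annihilator of P, a (k-h-1)-space;
# the columns below are a basis of ker(P).
C_P = np.linalg.svd(P)[2][h + 1:].T
print(np.allclose(P @ C_P, 0))                       # True

# P^{-1}(x) = <P+(x), C_P>, and X lies on this projecting ray:
ray = np.column_stack([P_plus @ x, C_P])
coeffs = np.linalg.lstsq(ray, X, rcond=None)[0]
print(np.allclose(ray @ coeffs, X))                  # True
```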

While in many applications one naturally considers projections from \({\mathbb P}^{k}\) to several spaces all of the same dimension (i.e. one considers a set of multiple images of the same kind of a scene, traditionally 3 or 4 real projections from \({\mathbb P}^{3}\) to \({\mathbb P}^{2}\)), the general setup allows for projections \(P_j:{\mathbb P}^{k}\setminus C_{P_j} \rightarrow {\mathbb P}^{h_j}\) to spaces of different dimensions. Two different images \(P_j(\mathbf {X})\) and \(P_m(\mathbf {X})\) of the same point \(\mathbf {X}\) are corresponding points, while \({\langle }C_{P_j},\mathbf {X}{\rangle }\) and \({\langle }C_{P_m},\mathbf {X}{\rangle }\) are corresponding rays. More generally, r linear subspaces \(S_i \subset {\mathbb P}^{h_i},\) \(i=1,\ldots , r,\) are said to be corresponding if there exists at least one point \(\mathbf {X} \in {\mathbb P}^{k}\) such that \(P_i(\mathbf {X})\in S_i\) for \(i=1,\ldots , r.\)

Hartley and Schaffalitzky [2] gave a comprehensive theoretical framework for the study of multiview geometry in any dimension. Here we recall the basic elements of their approach. Consider, again, a set of projections \(P_j:{\mathbb P}^{k}\setminus {C_{P_j}} \rightarrow {\mathbb P}^{h_j},\) \(j = 1,\ldots ,r,\) \(h_j \ge 2,\) and a partition \((\alpha _1, \alpha _2, \ldots , \alpha _r)\) of \(k+1,\) i.e. \(1 \le \alpha _j \le h_j\) for all j, and \(\sum \alpha _j = k+1.\) Let \(\{L_j\},\) \(j=1,\ldots ,r,\) be a set of general \(s_j\)-spaces, with \(L_j \subset {\mathbb P}^{h_j}\) and \(s_j=h_j-\alpha _j\).

Hartley and Schaffalitzky show that requiring \(\{L_j\}\) to be a set of corresponding subspaces, under the assumption that \(\sum \alpha _j = k+1,\) yields a multilinear relation among the Grassmann (Plücker) coordinates of the \(L_j.\) The rationale for the above assumption is essentially an application of the Grassmann formula. Indeed, if \(\sum \alpha _j < k+1\), then any r-tuple of general subspaces with the given codimensions \(\alpha _j\) would be corresponding, while if \(\sum \alpha _j > k+1,\) requiring an r-tuple of subspaces to be corresponding would generate more than one multilinear constraint.

The coefficients of these relations are then packaged into a multiview tensor, called a Grassmann tensor with profile \((\alpha _1, \ldots , \alpha _r).\) In the case of two, three, and four views from \({\mathbb P}^{3}\) to \({\mathbb P}^{2},\) this Grassmann tensor is the classical fundamental matrix, the trifocal tensor, and the quadrifocal tensor, respectively. As mentioned in the introduction, all these objects have been widely studied in different contexts by many authors [4–12]. This work is concerned with the Grassmann tensor in the case of projections from a space of any dimension onto two views of possibly different dimensions.

3 Generalized fundamental matrix

3.1 Definition and explicit construction

Let us now consider two projections \(P1=A\) and \(P2=B\) from \({\mathbb P}^{k}\) to \({\mathbb P}^{h_1}\) and to \({\mathbb P}^{h_2},\) respectively, where \(h_1+h_2 \ge k+1,\) and where A and B are maximal rank matrices as follows:

$$\begin{aligned} A=\left( \begin{array}{cccc} a_{1,1} &{} a_{1,2} &{} \cdots &{} a_{1,k+1} \\ a_{2,1} &{} \cdots &{} \cdots &{} a_{2,k+1} \\ \vdots &{} \vdots &{} \vdots &{} \vdots \\ a_{h_1+1,1} &{} \cdots &{} \cdots &{} a_{h_1+1,k+1} \\ \end{array} \right) \quad B=\left( \begin{array}{cccc} b_{1,1} &{} b_{1,2} &{} \cdots &{} b_{1,k+1} \\ b_{2,1} &{} \cdots &{} \cdots &{} b_{2,k+1} \\ \vdots &{} \vdots &{} \vdots &{} \vdots \\ b_{h_2+1,1} &{} \cdots &{} \cdots &{} b_{h_2+1,k+1} \\ \end{array} \right) . \end{aligned}$$

The Grassmann formula shows that our assumption \(h_1+h_2 \ge k+1\) implies, in particular, that for generic choices of A and B the projection centers \(C_A\) and \(C_B\) do not intersect. Their images \(E_1 = A(C_B)\) and \( E_2=B(C_A),\) of dimensions \(k-h_2-1\) and \(k-h_1-1\) respectively, are subspaces of the view spaces usually called epipoles.
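A short numerical sketch (again illustrative, for generic random projections) of the centers and epipoles just introduced:

```python
import numpy as np

rng = np.random.default_rng(1)
k, h1, h2 = 4, 3, 2
A = rng.standard_normal((h1 + 1, k + 1))
B = rng.standard_normal((h2 + 1, k + 1))

def null_basis(P):
    """Columns spanning ker(P), for P of maximal rank."""
    return np.linalg.svd(P)[2][P.shape[0]:].T

C_A, C_B = null_basis(A), null_basis(B)   # skew centers, generically
E1, E2 = A @ C_B, B @ C_A                 # epipoles in the two views

# Projective dimensions k-h2-1 and k-h1-1, respectively:
print(np.linalg.matrix_rank(E1) - 1, k - h2 - 1)   # 1 1
print(np.linalg.matrix_rank(E2) - 1, k - h1 - 1)   # 0 0
```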

Following [2], we choose a profile \((\alpha _1,\alpha _2),\) where, without loss of generality, \(\alpha _1 \ge \alpha _2,\) and with \(\alpha _1+\alpha _2=k+1,\) in order to obtain the constraints necessary to determine the corresponding tensor. In this case, the tensor is simply a matrix which we will call, following the classical case, generalized fundamental matrix.

Let \(L \subset {\mathbb P}^{h_1}, L' \subset {\mathbb P}^{h_2}\) be linear subspaces of dimension \(s_1=h_1-\alpha _1, s_2=h_2-\alpha _2,\) respectively, and let \(\mathbf {L} \in G(s_1,h_1) \subseteq {\mathbb P}^{\left( {\begin{array}{c}h_1+1\\ s_1+1\end{array}}\right) -1}\) and \(\mathbf {L}' \in G(s_2,h_2) \subseteq {\mathbb P}^{\left( {\begin{array}{c}h_2+1\\ s_2+1\end{array}}\right) -1}\) be their associated points in the appropriate Grassmannians.

Choose coordinates \((x_1,\ldots ,x_{h_1+1})\) and \((y_1,\ldots ,y_{h_2+1})\) in the two views \({\mathbb P}^{h_1}\) and \({\mathbb P}^{h_2},\) respectively. Given \(s_1+1\) points \(\mathbf {x^{j}}=(x^j_1,\ldots , x^j_{h_1+1}), j=1,\ldots ,s_1+1\) in \({\mathbb P}^{h_1}\) generating L and \(s_2+1\) points \(\mathbf {y^{l}}=(y^l_1,\ldots , y^l_{h_2+1}), l=1,\ldots ,s_2+1\) in \({\mathbb P}^{h_2}\) generating \(L',\) consider the following \((h_1+h_2+2) \times (h_1+h_2+2)\) square matrix:

$$\begin{aligned} M_{L,L'}=\left( \begin{array}{cccccccccccc} a_{1,1} &{} \cdots &{} a_{1,k+1}&{} x_1^1 &{} \cdots &{} x_1^{s_1+1} &{} 0 &{} \cdots &{} 0 \\ a_{2,1} &{} \cdots &{} a_{2,k+1} &{} x_2^1 &{} \cdots &{} x_2^{s_1+1} &{} 0 &{} \cdots &{} 0 \\ \vdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots \\ a_{h_1+1,1} &{} \cdots &{} a_{h_1+1,k+1} &{} x_{h_1+1}^{1} &{} \cdots &{} x_{h_1+1}^{s_1+1} &{} 0 &{} \cdots &{} 0 \\ b_{1,1} &{} \cdots &{} b_{1,k+1} &{} 0 &{} \cdots &{} 0 &{} y_1^1 &{} \cdots &{} y_1^{s_2+1} \\ b_{2,1} &{} \cdots &{} b_{2,k+1}&{} 0 &{} \cdots &{} 0 &{} y_2^1 &{} \cdots &{} y_{2}^{s_2+1} \\ \vdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots &{} \vdots &{}\vdots &{} \vdots &{} \vdots \\ b_{h_2+1,1} &{} \cdots &{} b_{h_2+1,k+1} &{} 0 &{} \cdots &{} 0 &{} y_{h_2+1}^{1} &{} \cdots &{} y_{h_2+1}^{s_2+1} \\ \end{array} \right) \end{aligned}$$

Remark 1

  1. a)

    If L and \(L'\) are corresponding spaces then the linear system

    $$\begin{aligned} M_{L,L'} \cdot \left[ \begin{array}{c} \mathbf {X} \\ \lambda _1 \\ \vdots \\ \lambda _{s_1+1}\\ \mu _1 \\ \vdots \\ \mu _{s_2+1} \\ \end{array} \right] =\mathbf {0} \end{aligned}$$
    (1)

    has a non-trivial solution, hence \(\det (M_{L,L'})=0.\) Indeed L and \(L'\) are corresponding if and only if there exists a point \(\mathbf {X} \in {\mathbb P}^{k}\) such that \(A \cdot \mathbf {X} \in L\) and \(B \cdot \mathbf {X} \in L'.\)

  2. b)

    If either \(L \cap E_1 \ne \emptyset \) or \(L' \cap E_2 \ne \emptyset ,\) then \(\det (M_{L,L'})=0.\) Indeed, if, e.g., \(\mathbf {e_2} \in L' \cap E_2\), then \(\mathbf {e_2} = B\cdot \hat{\mathbf {X}}\) for some \(\hat{\mathbf {X}} \in C_A,\) and \(\mathbf {e_2} = \sum _j\mu _j\mathbf {y^j}\) for some \(\mu _j.\) One can then replace \(\mathbf {y^{s_2+1}}\) with \(\mathbf {e_2}\) in the matrix \(M_{L,L'}.\) In this case, the last column of \(M_{L,L'}\) is \(\left[ \begin{array}{c} A\cdot \hat{\mathbf {X}} \\ B\cdot \hat{\mathbf {X}} \\ \end{array} \right] ,\) which is clearly linearly dependent on the first \(k+1\) columns.

  3. c)

    If \(L \cap E_1 = \emptyset ,\) \(L' \cap E_2 =\emptyset ,\) and \(\det (M_{L,L'})=0,\) then L and \(L'\) are corresponding. Indeed, with these assumptions, the linear system (1) has a non-trivial solution with, in particular, \(\mathbf {X} \ne \mathbf {0}.\) In fact, if \(\mathbf {X} = \mathbf {0},\) then either \(\lambda _j \ne 0\) for some j or \(\mu _j\ne 0\) for some j, and either of these cases would contradict the linear independence of the points chosen to generate L and \(L'.\)

In view of Remark 1, one sees that \(\det (M_{L,L'})=0\) is indeed the bilinear constraint that encodes the correspondence between subspaces of the two views.
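The following illustrative sketch builds \(M_{L,L'}\) for two generic projections from \({\mathbb P}^{4}\) to \({\mathbb P}^{3}\) with profile (3, 2) (the setting of Example 2 below) and checks Remark 1 numerically: the determinant vanishes on corresponding subspaces and is generically nonzero otherwise.

```python
import numpy as np

rng = np.random.default_rng(2)

k, h1, h2 = 4, 3, 3
a1, a2 = 3, 2                       # profile: a1 + a2 = k + 1
s1, s2 = h1 - a1, h2 - a2           # dimensions of L and L'

A = rng.standard_normal((h1 + 1, k + 1))
B = rng.standard_normal((h2 + 1, k + 1))

def M(Lgen, Lpgen):
    """The square matrix M_{L,L'} of order h1+h2+2, from matrices
    whose columns generate L and L'."""
    return np.block([
        [A, Lgen, np.zeros((h1 + 1, Lpgen.shape[1]))],
        [B, np.zeros((h2 + 1, Lgen.shape[1])), Lpgen],
    ])

# Corresponding subspaces: project a common point X and extend its
# two images to generic s1- and s2-spaces through them.
X = rng.standard_normal(k + 1)
Lgen = np.column_stack(
    [A @ X] + [rng.standard_normal(h1 + 1) for _ in range(s1)])
Lpgen = np.column_stack(
    [B @ X] + [rng.standard_normal(h2 + 1) for _ in range(s2)])
print(np.isclose(np.linalg.det(M(Lgen, Lpgen)), 0.0))     # True

# A generic, non-corresponding L gives a nonzero determinant.
Lbad = rng.standard_normal((h1 + 1, s1 + 1))
print(np.isclose(np.linalg.det(M(Lbad, Lpgen)), 0.0))     # False
```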

Let \(I=(i_1, \ldots , i_{s_1+1}),\) \(J=(j_1, \ldots , j_{s_2+1}),\) and \(\hat{J}=(h_1+1+j_1, \ldots , h_1+1+j_{s_2+1}),\) with \(1 \le i_1< \cdots < i_{s_1+1} \le h_1+1\) and \(1 \le j_1< \cdots < j_{s_2+1} \le h_2+1.\) Denote by \(I', \hat{J}'\) the (ordered) sets of complementary indices \(I'= \{r \in \{1, \ldots , h_1+1 \} \text { such that } r \notin I\}\) and \(\hat{J}'= \{ s \in \{h_1+2, \ldots , h_1+h_2+2 \} \text { such that } s \notin \hat{J}\}.\) Moreover, denote by \(A_I\) and \(B_J,\) respectively, the matrices obtained from A and B by deleting rows \(i_1, \ldots , i_{s_1+1}\) and \(j_1, \ldots , j_{s_2+1}.\)

Computing \(\det (M_{L,L'})\) with an iterated application of the generalized Laplace expansion, one gets:

$$\begin{aligned} \det (M_{L,L'})= \sum _{I,J}{\lambda }_I F_{I,J} {{\lambda }'}_J \end{aligned}$$

where \({\lambda }_I=p_{i_1,\ldots ,i_{s_1+1}}\) are Plücker coordinates of L, and similarly for \(\lambda '_J\) and \(L'\), and where the \(F_{I,J}\) are given by:

$$\begin{aligned} F_{I,J}=\epsilon (I,J) \det \left( \begin{array}{c} A_I \\ B_J \\ \end{array} \right) \end{aligned}$$
(2)

where \(\epsilon (I,J)\) is \(+1\) or \(-1\) according to the parity of the permutation \((I,\hat{J}, I',\hat{J}').\)

Notice that the \(F_{I,J}\) are suitable maximal minors of the matrix \(\left[ \begin{array}{c} A \\ B \\ \end{array} \right] ,\) so that they can be thought of as some of the Plücker coordinates of the k-subspace \(\varLambda _{AB} \subset {\mathbb P}^{h_1+h_2+1}\) spanned by the columns of the above matrix. From the construction above, and recalling the relationship between Plücker and dual Plücker coordinates (see [18, Vol I, Book II, p. 292]), one sees that the \(F_{I,J}\) are the dual Plücker coordinates \(q_{I,\hat{J}}(\varLambda _{AB}).\)

The above discussion lays the groundwork for the following definition, in which we keep the same notation.

Definition 1

The generalized fundamental matrix for two projections A, B from \({\mathbb P}^{k}\) to \({\mathbb P}^{h_1}\) and \({\mathbb P}^{h_2},\) with profile \((\alpha _1, \alpha _2),\) is, up to a multiplicative nonzero constant, the \(\left( {\begin{array}{c}h_1+1\\ h_1-\alpha _1+1\end{array}}\right) \times \left( {\begin{array}{c}h_2+1\\ h_2-\alpha _2+1\end{array}}\right) \) matrix \(\mathfrak {F}\) whose entries \(F_{I,J}\) are the dual Plücker coordinates \(q_{I,\hat{J}}(\varLambda _{AB}),\) with lexicographical order of I for the rows and of \(\hat{J}\) for the columns.

Example 1

In the classical case of two projections from \({\mathbb P}^{3}\) to \({\mathbb P}^{2},\) with \(\alpha _1=\alpha _2 = 2,\) the matrix \(\left[ \begin{array}{c} A \\ B \\ \end{array} \right] \) has dimension \(6 \times 4.\) The subspace \(\varLambda _{AB}\) is in \(G(3,5) \subset {\mathbb P}^{14},\) hence its 15 Plücker coordinates are \(q_{r_1,r_2}\) with \(1 \le r_1 < r_2 \le 6,\) while the entries of the fundamental matrix are only the nine coordinates \(q_{i_1,j_1}\) with \(1 \le i_1 \le 3\) and \(4 \le j_1 \le 6;\) indeed, one has:

$$\begin{aligned} \mathfrak {F} = \left( \begin{array}{lll} q_{1,4} &{}\quad q_{1,5} &{}\quad q_{1,6} \\ q_{2,4} &{}\quad q_{2,5} &{}\quad q_{2,6}\\ q_{3,4} &{}\quad q_{3,5} &{}\quad q_{3,6} \\ \end{array} \right) . \end{aligned}$$
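By bilinearity of \(\det (M_{L,L'})\) in the two sets of generators, each entry of \(\mathfrak {F}\) can be obtained numerically as \(f_{ij}=\det M_{L,L'}\) with \(L=\{\mathbf {e}_i\}\) and \(L'=\{\mathbf {e}_j\}\) standard basis points. The following illustrative NumPy sketch does this for a generic pair of classical cameras, then checks the correspondence constraint and the well-known rank 2 (anticipating Theorem 1).

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 4))     # two generic cameras P^3 --> P^2
B = rng.standard_normal((3, 4))

def fundamental_matrix(A, B):
    """F[i, j] = det M_{L,L'} with L = {e_i}, L' = {e_j}: by
    bilinearity these determinants are exactly the entries F_{I,J}."""
    F = np.empty((3, 3))
    for i in range(3):
        for j in range(3):
            M = np.zeros((6, 6))
            M[:3, :4], M[3:, :4] = A, B
            M[i, 4], M[3 + j, 5] = 1.0, 1.0    # x = e_i, y = e_j
            F[i, j] = np.linalg.det(M)
    return F

F = fundamental_matrix(A, B)
print(np.linalg.matrix_rank(F))          # 2

X = rng.standard_normal(4)               # a scene point ...
x, y = A @ X, B @ X                      # ... and its two images
print(np.isclose(x @ F @ y, 0.0))        # True: corresponding pair
```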

Example 2

Consider two projections from \({\mathbb P}^{4}\) to \({\mathbb P}^{3}\) with profile (3, 2). In this case, the matrix \(\left[ \begin{array}{c} A \\ B \\ \end{array} \right] \) has dimension \(8 \times 5.\) The subspace \(\varLambda _{AB}\) is in \(G(4,7) \subset {\mathbb P}^{{\left( {\begin{array}{c}8\\ 5\end{array}}\right) }-1},\) and the fundamental matrix \(\mathfrak {F}\) is:

$$\begin{aligned} \mathfrak {F} = \left( \begin{array}{llllll} q_{1,5,6} &{}\quad q_{1,5,7}&{}\quad q_{1,5,8} &{}\quad q_{1,6,7} &{}\quad q_{1,6,8} &{}\quad q_{1,7,8} \\ q_{2,5,6} &{}\quad q_{2,5,7}&{}\quad q_{2,5,8} &{}\quad q_{2,6,7} &{}\quad q_{2,6,8} &{}\quad q_{2,7,8} \\ q_{3,5,6} &{}\quad q_{3,5,7}&{}\quad q_{3,5,8} &{}\quad q_{3,6,7} &{}\quad q_{3,6,8} &{}\quad q_{3,7,8} \\ q_{4,5,6} &{}\quad q_{4,5,7}&{}\quad q_{4,5,8} &{}\quad q_{4,6,7} &{}\quad q_{4,6,8} &{}\quad q_{4,7,8} \\ \end{array} \right) . \end{aligned}$$

Example 3

Finally, we consider two projections from \({\mathbb P}^{5}\) to \({\mathbb P}^{h_1}={\mathbb P}^{4}\) and \({\mathbb P}^{h_2}={\mathbb P}^{3}\) with profile (3, 3). In this case the fundamental matrix \(\mathfrak {F}\) has dimension \(10 \times 4\) and is of the form

$$\begin{aligned} \mathfrak {F} = \left( \begin{array}{llll} q_{1,2,6} &{}\quad q_{1,2,7}&{}\quad q_{1,2,8} &{}\quad q_{1,2,9} \\ q_{1,3,6} &{}\quad q_{1,3,7}&{}\quad q_{1,3,8} &{}\quad q_{1,3,9} \\ \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots \\ q_{4,5,6} &{}\quad q_{4,5,7}&{}\quad q_{4,5,8} &{}\quad q_{4,5,9} \\ \end{array} \right) . \end{aligned}$$

The classical fundamental matrix F for two projections from \({\mathbb P}^{3}\) to \({\mathbb P}^{2},\) besides its role as a bilinear form \(\mathbf {x}^T\cdot F \cdot {\mathbf {x}}',\) vanishing on pairs of corresponding points on the two views, can also be viewed as a map. Indeed, F can be interpreted as mapping a point \(\mathbf {x}\ne E_1\) in the first view to the line \(\mathbf {x}^T\cdot F,\) passing through \(E_2\) and containing all points \(\mathbf {x'}\) that correspond to \(\mathbf {x}.\) Hence F defines a rational map \({\mathbb P}^{2} \dashrightarrow \check{{\mathbb P}^{2}}\) whose image is the pencil of lines through \(E_2.\)

Similarly, the generalized fundamental matrix is, on the one hand, the matrix of a bilinear form acting on pairs \((L,L')\) where L and \(L'\) are corresponding linear spaces in the two views and, on the other hand, can also be viewed as a rational map associating to a linear space L,  such that \(L\cap E_1= \emptyset ,\) the linear space \(B(A^{-1}(L))\) which, for a generic L, has dimension \(k-\alpha _1.\) This interpretation of \(\mathfrak {F}\) as a map is investigated in the next section where, in particular, it will be useful in computing \(\mathrm{rk}\,(\mathfrak {F})\).

3.2 Geometric determination of \(\mathrm{rk}\,(\mathfrak {F})\)

The determination of the rank of a tensor is an interesting and usually quite intricate problem. The rank of the classical fundamental matrix is well known to be 2. The rank of the trifocal tensor was determined to be 4, while the rank of the quadrifocal tensor turns out to be 9 [10]. Nothing is known in general about the ranks of generalized Grassmann tensors. In this section, we compute the rank of the generalized fundamental matrix as a first step in this direction. In the notation of the previous section, one has the following theorem.

Theorem 1

The generalized fundamental matrix \(\mathfrak {F}\) for two projections of maximal rank whose centers do not intersect each other, with profile \((\alpha _1,\alpha _2),\) defines a rational map \(\varPhi : G(s_1,h_1) \dashrightarrow G(k-\alpha _1, h_2)\) whose image is the Schubert variety \(\varOmega _{(k-\alpha _1,h_2)}(E_2).\) Moreover:

$$\begin{aligned} \mathrm{rk}\,(\mathfrak {F}) =\left( {\begin{array}{c}(h_1-\alpha _1+1)+(h_2-\alpha _2+1)\\ h_1-\alpha _1+1\end{array}}\right) . \end{aligned}$$

Proof

Consider \(\mathfrak {F}\) as the matrix of a rational map \(\varPhi : G(s_1,h_1) \dashrightarrow G(k-\alpha _1, h_2)\) defined as \(\varPhi ({\mathbf {L}})=\hat{{\mathbf {L}}} \in G(k-\alpha _1,h_2) = G(\alpha _2-1,h_2) \subseteq {\mathbb P}^{N_2},\) where \(\hat{\mathbf {L}}\) is the vector of Plücker coordinates of \(B(A^{-1}(L)).\) As we have seen in Remark 1, this map is not defined at the points of \(G(h_1-\alpha _1,h_1)\) corresponding to linear spaces which intersect \(E_1\). Due to the assumption \(\alpha _1+\alpha _2=k+1,\) the open subset of \(G(h_1-\alpha _1,h_1)\) where the map is defined is not empty.

As in the classical case, the linear space \(B(A^{-1}(L))\) contains \(E_2.\) This implies that, as \(\mathbf {L}\) varies in \(G(s_1,h_1)\), the image \(\varPhi (\mathbf {L})\) varies in \(\varOmega _{(k-\alpha _1,h_2)}(E_2),\) which turns out to be a variety of dimension \((h_2+\alpha _1-k)(h_1-\alpha _1+1)\) spanning a linear space \({\mathbb P}^{\omega },\) where \(\omega =\left( {\begin{array}{c}(h_1-\alpha _1+1)+(h_2-\alpha _2+1)\\ h_1-\alpha _1+1\end{array}}\right) -1=\left( {\begin{array}{c}h_1+h_2-k+1\\ h_1-\alpha _1+1\end{array}}\right) -1\). This dimension can be computed using Proposition 3 in [13] to determine the number of independent linear conditions which define \(\varOmega _{(k-\alpha _1,h_2)}(E_2)\) inside the Grassmannian.

In Plücker coordinates, the map \(\varPhi \) is defined by \(\varPhi (\mathbf {L})= {\mathbf {L}}^T \cdot \mathfrak {F},\) and it extends to a rational map \(\widetilde{\varPhi }:{\mathbb P}^{N_1} \dashrightarrow {\mathbb P}^{N_2},\) where \(N_1 = \left( {\begin{array}{c}h_1+1\\ h_1-\alpha _1+1\end{array}}\right) -1\) and \(N_2=\left( {\begin{array}{c}h_2+1\\ h_2-\alpha _2+1\end{array}}\right) -1.\) The image of \(\widetilde{\varPhi }\) is contained in \({\mathbb P}^{\omega }\) because the Grassmannian \(G(s_1,h_1)\) is not contained in any hyperplane and hence spans all of \({\mathbb P}^{N_1}.\) Moreover, it is not difficult to see that \(\overline{Im({\varPhi })} = \varOmega _{(k-\alpha _1,h_2)}(E_2),\) hence \({\varPhi }\) is dominant on \(\varOmega _{(k-\alpha _1,h_2)}(E_2),\) and this implies that \(\widetilde{\varPhi }\) is dominant on \({\mathbb P}^{\omega }\). Indeed, for any \(\varvec{\Lambda } \in \varOmega _{(k-\alpha _1,h_2)}(E_2)\) one has \(\varvec{\Lambda } = \varPhi (\mathbf {L})=\widetilde{\varPhi }(\mathbf {L})\) for any \(s_1\)-space \(L \subset A(B^{-1}(\varLambda )),\) where, according to our notation, \(\varvec{\Lambda }\) denotes the vector of Plücker coordinates of the linear space \(\varLambda .\) This implies that the fundamental matrix has rank \(\mathrm{rk}\,(\mathfrak {F}) = \omega + 1.\) \(\square \)
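As a numerical sanity check of Theorem 1 (illustrative code, independent of the proof), one can assemble \(\mathfrak {F}\) entrywise in the setting of Example 2 (\(k=4,\) \(h_1=h_2=3,\) profile (3, 2)), where the expected rank is 3: taking for L and \(L'\) the spans of standard basis vectors, the Plücker vectors \(\lambda ,\lambda '\) become indicator vectors and \(\det (M_{L,L'})\) isolates the single coefficient \(F_{I,J}.\)

```python
from itertools import combinations

import numpy as np

rng = np.random.default_rng(4)
k, h1, h2, a1, a2 = 4, 3, 3, 3, 2
s1, s2 = h1 - a1, h2 - a2

A = rng.standard_normal((h1 + 1, k + 1))
B = rng.standard_normal((h2 + 1, k + 1))

def entry(I, J):
    """F_{I,J} = det M_{L,L'}, with L and L' spanned by the standard
    basis vectors indexed by I and J (0-based)."""
    m = h1 + h2 + 2
    M = np.zeros((m, m))
    M[:h1 + 1, :k + 1], M[h1 + 1:, :k + 1] = A, B
    for c, i in enumerate(I):
        M[i, k + 1 + c] = 1.0                # generators of L
    for c, j in enumerate(J):
        M[h1 + 1 + j, k + 2 + s1 + c] = 1.0  # generators of L'
    return np.linalg.det(M)

rows = list(combinations(range(h1 + 1), s1 + 1))   # lexicographic I
cols = list(combinations(range(h2 + 1), s2 + 1))   # lexicographic J
F = np.array([[entry(I, J) for J in cols] for I in rows])

print(F.shape)                    # (4, 6), as in Example 2
print(np.linalg.matrix_rank(F))   # 3, as predicted by Theorem 1
```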

Remark 2

Notice that, as in the classical case, the generalized fundamental matrix is not of maximal rank. From the geometric point of view, this corresponds to the fact that the fiber \(\varPhi ^{-1}(\varvec{\Lambda })\), for a given generic \(\varvec{\Lambda } \in \varOmega _{(k-\alpha _1,h_2)}(E_2)\), turns out to be the Grassmann variety of the \((h_1-\alpha _1)\)-spaces contained in \(A(B^{-1}(\varLambda ))\), which is a space of dimension \(h_1-h_2+\alpha _2-1.\) This implies that the dimension of the fiber \(\varPhi ^{-1}(\varvec{\Lambda })\) and, a fortiori, of the fiber \(\widetilde{\varPhi }^{-1}(\varvec{\Lambda })\) is positive.

Remark 3

When the hypotheses of Theorem 1 on skew centers of projection and maximal rank of the projection matrices are not satisfied, the generalized fundamental matrix could a priori have rank lower than the expected one. In reality, either \(\mathfrak {F}\) is identically zero or it cannot even be constructed.

If \(P \in C_A \cap C_B,\) one can assume, up to projective transformations in \({\mathbb P}^{k}\), that \(P=(1:0: \cdots :0)\). In this case, both projection matrices must have a null first column, and hence from (2) it follows that \(\mathfrak {F}=0.\)

In the case of projection matrices of non-maximal rank, one can see that the Grassmann tensor cannot even be constructed. Indeed, for a given profile \((\alpha _1=h_1-s_1,\alpha _2=h_2-s_2)\) with \(\alpha _1+\alpha _2=k+1\), as the dimensions of the subspaces \({\mathbb P}^{h'_1} \subseteq {\mathbb P}^{h_1}\) and \({\mathbb P}^{h'_2} \subseteq {\mathbb P}^{h_2}\) which are the images of the two projections decrease, the corresponding codimensions \((\alpha '_1=h'_1-s_1,\alpha '_2=h'_2-s_2)\) no longer satisfy the condition \(\alpha '_1+\alpha '_2=k+1\) under which it is possible to define a Grassmann tensor.

3.3 Plücker relations among rows of \(\mathfrak {F}\)

Theorem 1 shows that \(\mathfrak {F}\) does not have maximal rank. It is natural to wonder about the nature of the linear relations among the rows (or columns) of the generalized fundamental matrix responsible for the drop in rank. In this section, we determine such relations, under suitable assumptions on \(h_i\) and \(\alpha _i,\) showing that they are consequences of the well-known generalized Plücker relations (in dual coordinates). In order to do that, we recall [13, p. 1076] that for a Grassmannian \(G(d,n),\) the following relations hold:

$$\begin{aligned} \sum _{\sigma }sgn(\sigma )q_{i_1 \ldots i_{\lambda },\sigma i_{\lambda +1} \ldots \sigma i_{d+1}}q_{\sigma j_{1} \ldots \sigma j_{\lambda +1},j_{\lambda +2} \ldots j_{d+1}} =0, \end{aligned}$$
(3)

where the sum ranges over all permutations \(\sigma \) of \((i_{\lambda +1} \ldots i_{d+1},j_{1} \ldots j_{\lambda +1})\) such that \(\sigma i_{\lambda +1}< \cdots < \sigma i_{d+1}\) and \(\sigma j_{1}< \cdots < \sigma j_{\lambda +1}\).

3.3.1 Case \(h_1=h_2=h\) and profile (hh)

In this situation, the matrix \(\mathfrak {F}\) is a square matrix of order \(h+1,\) of rank 2 (by Theorem 1), whose entry in position \((i,j)\) is \(f_{ij}=q_{i,(h+1)+j}.\) Considering any three rows of \(\mathfrak {F}\):

$$\begin{aligned} R_a= & {} [q_{a,(h+1)+1}, q_{a,(h+1)+2}, \ldots , q_{a,(h+1)+h+1}]\\ R_b= & {} [q_{b,(h+1)+1}, q_{b,(h+1)+2}, \ldots , q_{b,(h+1)+h+1}]\\ R_c= & {} [q_{c,(h+1)+1}, q_{c,(h+1)+2}, \ldots , q_{c,(h+1)+h+1}] \end{aligned}$$

with \(1 \le a<b<c \le h+1\), the generalized Plücker relations (3) imply the relation:

$$\begin{aligned} q_{b,c}R_a-q_{a,c}R_b+q_{a,b}R_c=0. \end{aligned}$$

Notice that when \(k=3\) and \(h=2\), one has the classical scenario described in Example 1.
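For instance, for \(h=2\) the relation above, with \((a,b,c)=(1,2,3),\) reads

$$\begin{aligned} q_{2,3}R_1-q_{1,3}R_2+q_{1,2}R_3=0, \end{aligned}$$

where the coefficients \(q_{a,b},\) \(1 \le a<b \le 3,\) are the dual Plücker coordinates of \(\varLambda _{AB}\) with both indices in the first block, i.e. exactly those not appearing among the entries of \(\mathfrak {F}\) listed in Example 1. This single relation among the three rows accounts for the drop from rank 3 to rank 2.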

3.3.2 Case \(h_1=h_2=h\) and profile \((h,h-1)\)

In this case, \(\mathfrak {F}\) has dimension \((h+1) \times \frac{h(h+1)}{2}\) and rank 3 (by Theorem 1). Explicitly:

$$\begin{aligned} \mathfrak {F} =\left( \begin{array}{cccccc} q_{1,(h+1)+1,(h+1)+2} &{} q_{1,(h+1)+1,(h+1)+3}&{} \cdots &{} q_{1,(h+1)+h,(h+1)+(h+1)} \\ q_{2,(h+1)+1,(h+1)+2} &{} q_{2,(h+1)+1,(h+1)+3}&{} \cdots &{} q_{2,(h+1)+h,(h+1)+(h+1)} \\ \vdots &{} \vdots &{} \cdots &{} \vdots \\ q_{h+1,(h+1)+1,(h+1)+2} &{} q_{h+1,(h+1)+1,(h+1)+3}&{} \cdots &{} q_{h+1,(h+1)+h,(h+1)+(h+1)} \\ \end{array} \right) \end{aligned}$$

For any four rows of \(\mathfrak {F}\):

$$\begin{aligned} R_a= & {} [q_{a,(h+1)+1,(h+1)+2}, q_{a,(h+1)+1,(h+1)+3}, \ldots , q_{a,(h+1)+h,(h+1)+(h+1)}]\\ R_b= & {} [q_{b,(h+1)+1,(h+1)+2}, q_{b,(h+1)+1,(h+1)+3}, \ldots , q_{b,(h+1)+h,(h+1)+(h+1)}]\\ R_c= & {} [q_{c,(h+1)+1,(h+1)+2}, q_{c,(h+1)+1,(h+1)+3}, \ldots , q_{c,(h+1)+h,(h+1)+(h+1)}]\\ R_d= & {} [q_{d,(h+1)+1,(h+1)+2}, q_{d,(h+1)+1,(h+1)+3}, \ldots , q_{d,(h+1)+h,(h+1)+(h+1)}] \end{aligned}$$

with \(1 \le a<b<c<d \le h+1\), the generalized Plücker relations (3) imply the relation:

$$\begin{aligned} q_{b,c,d}R_a-q_{a,c,d}R_b+q_{a,b,d}R_c-q_{a,b,c}R_d=0. \end{aligned}$$

Notice that for \(k=4\) and \(h=3\) one has the same situation as in Example 2.

3.3.3 Case \(h_1=h_2=h\), \(k=h+1\) and any profile \((\alpha _1,\alpha _2) =(h-t,t+2)\)

In this case, we can consider any profile \((\alpha _1,\alpha _2) = (h-t,t+2)\), with \( 0 \le t \le \frac{h-2}{2}.\) Notice that the assumption \(k=h+1\) is equivalent to the condition that the number of rows of \(\mathfrak {F},\) namely \(\left( {\begin{array}{c}h+1\\ h-\alpha _1 +1\end{array}}\right) ,\) is exactly the number \(\left( {\begin{array}{c}d+2\\ \lambda +1\end{array}}\right) \) of summands appearing in relations (3). Hence, in this case, each Plücker relation involves all of the rows of \(\mathfrak {F}.\)

Theorem 1 gives \(\mathrm{rk}\,(\mathfrak {F})= \left( {\begin{array}{c}h\\ t+1\end{array}}\right) \). Here we show, by induction on t, that all relations among the rows of \(\mathfrak {F}\) are generated by generalized Plücker relations as in (3). Notice that one can consider relations (3) as linear equations in the entries \(q_{\sigma j_{1} \ldots \sigma j_{\lambda +1},j_{\lambda +2} \ldots j_{d+1}}\) of the fundamental matrix, with coefficients \(q_{i_1 \ldots i_{\lambda },\sigma i_{\lambda +1} \ldots \sigma i_{d+1}},\) which are themselves Plücker coordinates; these coefficients do not depend on \(j_{\lambda +2}, \ldots, j_{d+1},\) so they are the same for all the elements of a given column of \(\mathfrak {F}\).

Hence relations (3) give rise to linear relations among the rows of \(\mathfrak {F}\). More precisely, under the above assumptions, for a profile \((h-t,t+2)\), the matrix \(\mathfrak {F}\) has dimension \(\left( {\begin{array}{c}h+1\\ h-t\end{array}}\right) \times \left( {\begin{array}{c}h+1\\ t+2\end{array}}\right) \) and its entries are \(f_{I,J}\) with \(I=(i_1, \ldots , i_{t+1})\) and \(J=(j_1, \ldots , j_{h-t-1}),\) so that, in the generalized Plücker relations (3), with \(\lambda =t\) and \(d=h-1\), the entries of \(\mathfrak {F}\) can be identified with the elements \(q_{\sigma j_1 \ldots \sigma j_{t+1},j_{t+2} \ldots j_{h}}\) and the coefficients of (3) can be organized in the following \(\left( {\begin{array}{c}2h+2\\ t+2\end{array}}\right) \times \left( {\begin{array}{c}h+1\\ t+3\end{array}}\right) \) matrix: \(\mathfrak {R(F)}=[ q_{i_1 \ldots i_{t},\sigma i_{t+1} \ldots \sigma i_{h}}].\) The rows of \(\mathfrak {R(F)}\) depend on the choice of the indices \(i_1 \ldots i_{t}\) among the \(2h+2\) elements corresponding to the rows of the matrix \(\left[ \begin{array}{c} A \\ B \\ \end{array} \right] \). The proof hinges on the fact that, if we restrict the choice only to the \(h+1\) rows of the matrix A, we obtain a \(\left( {\begin{array}{c}h+1\\ t+2\end{array}}\right) \times \left( {\begin{array}{c}h+1\\ t+3\end{array}}\right) =\left( {\begin{array}{c}h+1\\ h-t-1\end{array}}\right) \times \left( {\begin{array}{c}h+1\\ t+3\end{array}}\right) \) submatrix \(\mathfrak {R_0}\), which turns out to be the fundamental matrix corresponding to the two projections B and A, in reverse order, with profile \(\alpha _1=h-(t-1),\) \(\alpha _2=(t-1)+2\). Hence one can apply induction on t to prove that \(\mathrm{rk}\,(\mathfrak {F}_t)=\left( {\begin{array}{c}h\\ t+1\end{array}}\right) \), where \(\mathfrak {F}_t\) denotes the generalized fundamental matrix for the profile \((h-t,t+2)\). Analogously, let us denote by \(\mathfrak {R_0}_t\) the matrix of the Plücker relations considered above, for the profile \((h-t,t+2)\).

  • If \(t=0,\) then \(\mathfrak {F}_0\) has dimension \((h+1) \times \left( {\begin{array}{c}h+1\\ 2\end{array}}\right) \) and \(\mathfrak {R_0}_0\) is a row vector, as \(\lambda =0\). Hence this single Plücker relation is enough to show that the rank of \(\mathfrak {F}_0\) is h.

  • By induction, at step \(t-1\), we assume the statement is true for the profile \((h-(t-1),(t-1)+2)\), i.e. \(\mathrm{rk}\,(\mathfrak {F}_{t-1})=\left( {\begin{array}{c}h\\ t\end{array}}\right) \). At step t, we recall that \(\mathrm{rk}\,(\mathfrak {R_0}_t)= \mathrm{rk}\,(\mathfrak {F}_{t-1})=\left( {\begin{array}{c}h\\ t\end{array}}\right) ,\) so that

    $$\begin{aligned} rk(\mathfrak {F}_{t})=\left( {\begin{array}{c}h+1\\ t+1\end{array}}\right) -rk(\mathfrak {R_0}_t) =\left( {\begin{array}{c}h+1\\ t+1\end{array}}\right) -\left( {\begin{array}{c}h\\ t\end{array}}\right) =\left( {\begin{array}{c}h\\ t+1\end{array}}\right) . \end{aligned}$$

4 The variety of generalized fundamental matrices

In this section, we investigate the geometric properties of the variety \(\mathfrak {X} = \mathfrak {X_{(\alpha _1,\alpha _2)}}\) parameterizing the fundamental matrices corresponding to a given profile \((\alpha _1,\alpha _2)\) for two projections from \({\mathbb P}^{k}\) to \({\mathbb P}^{h_1}\) and \({\mathbb P}^{h_2},\) with \(k=\alpha _1 + \alpha _2 -1\), \(1 \le \alpha _1 \le h_1,\) \(1 \le \alpha _2 \le h_2\), \(h_1+h_2 \ge k+1,\) and \(\alpha _1 \ge \alpha _2.\) In particular, we prove the following Proposition:

Proposition 1

In the notation and setting above, \(\mathfrak {X}\) is an algebraic variety with:

$$\begin{aligned} dim(\mathfrak {X})= k(h_1+h_2)+h_1+h_2-k^2. \end{aligned}$$

Proof

The construction of this variety can be accomplished by drawing inspiration from the setting of Aholt and Oeding in [7]. Let \(U_1\) and \(U_2\) be the vector spaces, of dimensions \(h_1+1\) and \(h_2+1\) respectively, associated with the rows of A and of B, and let W be the vector space \(U_1 \oplus U_2\), of dimension \(h_1+h_2+2\). The matrix \(\left[ \begin{array}{c} A\\ B\\ \end{array} \right] \) can be thought of, via the span of its columns, as a point in the Grassmann variety \(G(k,{\mathbb P}(W)) \subset {\mathbb P}({\bigwedge }^{k+1}W)\).

As the projection matrices are defined up to independent projective transformations of the two views, one is naturally led to consider the group \(S= SL(U_1) \times SL(U_2)\subset SL(W),\) consisting of the block-diagonal matrices of order \(h_1+h_2+2\) with unit-determinant blocks of sizes \((h_1+1) \times (h_1+1)\) and \((h_2+1) \times (h_2+1)\) on the diagonal, and its action on the affine cone \(C(G(k,{\mathbb P}(W)))\) over \(G(k,{\mathbb P}(W)).\)

In the decomposition of \({\bigwedge }^{k+1}W\) as an S-module:

$$\begin{aligned} {\bigwedge }^{k+1}W = {\bigoplus }_{a=0}^{k+1}\left( {\bigwedge }^{a}U_1 \otimes {\bigwedge }^{k+1-a}U_2\right) , \end{aligned}$$
(4)

the summand corresponding to the minors of \(\left[ \begin{array}{c} A \\ B \\ \end{array} \right] \) appearing in the fundamental matrix \(\mathfrak {F}\) is \({\bigwedge }^{\alpha _1}U_1 \otimes {\bigwedge }^{\alpha _2}U_2.\)

Consider the projection \(\pi : {\mathbb P}({\bigwedge }^{k+1}(U_1 \oplus U_2)) \dashrightarrow {\mathbb P}({\bigwedge }^{\alpha _1}U_1 \otimes {\bigwedge }^{\alpha _2}U_2)\). Then

$$\begin{aligned} \mathfrak {X} = \overline{\pi (G(k,{\mathbb P}(W)))}. \end{aligned}$$

As in the classical case of projections from \({\mathbb P}^{3}\) to \({\mathbb P}^{2}\) [7, 10], the projection \(\pi \) induces a birational map

$$\begin{aligned} \varPi :C(G(k,{\mathbb P}(W)))_{/(\mathbb {C}^*\bigoplus \mathbb {C}^*)} \dashrightarrow \mathfrak {X} \hookrightarrow {\mathbb P}({\bigwedge }^{k+1}(U_1 \oplus U_2)). \end{aligned}$$

Indeed, Hartley and Schaffalitzky [2] prove that, except for the case \(h_1=h_2=1\), a Grassmann tensor completely determines the set of projection matrices up to projective transformations.

The birationality of the map \(\varPi \) allows us to compute the dimension of \(\mathfrak {X}\) by counting the number of essential parameters needed to determine A and B. This gives:

$$\begin{aligned} \dim (\mathfrak {X})= & {} ((h_1+1)(k+1)-1)+((h_2+1)(k+1)-1)-((k+1)^2-1)\nonumber \\= & {} k(h_1+h_2)+h_1+h_2-k^2. \end{aligned}$$

\(\square \)

Remark 4

Note that, as expected, \(\dim {\mathfrak {X}}\) is also the dimension of the quotient \(C(G(k,{\mathbb P}(W)))_{/(\mathbb {C}^*\bigoplus \mathbb {C}^*)}\).

4.1 A characterization of \(\mathfrak {X}\) with profile \((h_1,h_2)\)

In particular, in the case of profile \((\alpha _1,\alpha _2)=(h_1,h_2)\), under the usual assumptions of projections of maximal rank and skew centers, we get the following:

Theorem 2

A matrix \(\mathfrak {G}\) of dimension \((h_1+1) \times (h_2+1)\) is the fundamental matrix of a suitable pair of projections from \({\mathbb P}^{h_1+h_2-1}\) to \({\mathbb P}^{h_1}\) and \({\mathbb P}^{h_2},\) with profile \((h_1, h_2)\) and skew centers, if and only if \(\mathrm{rk}\,(\mathfrak {G})=2.\)

Proof

Under the current assumptions, Theorem 1 gives \(\mathrm{rk}\,(\mathfrak {F}) = 2.\) On the other hand, the dimension of the variety of rank 2 matrices of dimension \((h_1+1)\times (h_2+1)\) is \(2(h_1+h_2)-1=\dim (\mathfrak {X})\). Since \(\mathfrak {X}\) is irreducible and contained in this determinantal variety, which is irreducible as well, the equality of dimensions forces the two varieties to coincide. \(\square \)

This generalizes the well-known result for the classical case [10, Theorem 4.1]. Notice that, in general, one cannot expect to extend this result to other profiles. Indeed, Proposition 1 shows that \(\dim (\mathfrak {X})= k(h_1+h_2)+h_1+h_2-k^2.\)

On the other hand, the dimension, \(\delta _r\), of the variety of \((N_1+1) \times (N_2+1)\) matrices of rank r is

$$\begin{aligned} \delta _r = (N_1+N_2+2)r-r^2-1. \end{aligned}$$

As \(N_1=\left( {\begin{array}{c}h_1+1\\ \alpha _1\end{array}}\right) -1\) and \(N_2=\left( {\begin{array}{c}h_2+1\\ \alpha _2\end{array}}\right) -1\), this dimension generically turns out to be greater than \(\dim (\mathfrak {X})\). For example, if \(h_1=h_2=h\) and \((\alpha _1,\alpha _2)=(h,h-1)\), which implies \(\mathrm{rk}\,(\mathfrak {F})=r=3,\) one has \(\dim (\mathfrak {X})=6h-4\) and \(\delta _r=\frac{3}{2}h^2+\frac{9}{2}h-7\).

5 Application to reconstruction in computer vision

From the point of view of computer vision, the generalized fundamental matrix is relevant because of its application to reconstruction problems.

Given multiple images of an unknown scene, taken by unknown cameras, the goal of the reconstruction problem is to determine the positions of the cameras and of the scene points. Reconstruction problems can be posed in several geometric settings, such as metric, affine, or projective. As this work is conducted entirely within the projective framework, reconstruction will always be understood up to projective transformations.

Within a projective setting, the center is the only property of a camera which is preserved under homographies of the view spaces (e.g. see [19]), and hence projective reconstruction of cameras consists only of the determination of their centers.

Therefore, given r views of a scene \(\{\mathbf {X_i}\}\subset {\mathbb P}^{k}\), the reconstruction problem has two stages: reconstructing the camera centers and reconstructing the scene, i.e. the position of the points \(\{\mathbf {X_i}\}\) in \({\mathbb P}^{k},\) once cameras have been reconstructed.

Assuming that the scene to be reconstructed consists of a large enough number of points in sufficiently general position, a first natural question is to determine the numerical conditions on k and \(h_j,\) \(j=1,\ldots,r,\) under which projective reconstruction is possible, both for the cameras \(P_j\) and for the scene \(\{\mathbf {X_i}\}\) in \({\mathbb P}^{k}.\)

Both these conditions are implicitly given in [2] in the general setting, and they are explicitly highlighted in [20, Propositions 3.1 and 3.2] and [21, Sect. 4.1] in the case \(h_1=\cdots =h_r\).

We now recall these conditions in our more general setting, and we sketch the geometric rationale for the statements.

  1. a)

    Let \({\mathbb P}^{h_j}\), \(j=1,\ldots ,r,\) be r target views for projections from \({\mathbb P}^{k},\) in which a large enough set of corresponding subspaces, of dimensions \(s_j=h_j-\alpha _j\), is given. Then the centers of projection can be determined if and only if \(\sum _{j=1}^r \alpha _j \ge k+1.\) Indeed, the projective reconstruction of the associated cameras (hence of the centers of projection) is possible only if requiring the existence of a point in the intersection of all the rays projecting the given corresponding subspaces imposes an actual non-trivial constraint on the position of the centers in \({\mathbb P}^{k}.\) In this case, knowing enough sets of corresponding subspaces, one can deduce the equations of the centers. For this to happen, the intersection of r general \((k-\alpha _i)\)-spaces, \(i=1,\ldots ,r,\) must be empty. The Grassmann formula shows that the dimension of the iterated intersection of such \((k-\alpha _i)\)-spaces drops by \(\alpha _i\) at each step; therefore \(\sum _{i=1}^r \alpha _i \ge k+1.\)

  2. b)

    Let \({\mathbb P}^{h_j}\), \(j=1,\ldots ,r,\) be r target views for projections from \({\mathbb P}^{k},\) in which a large enough set of corresponding subspaces, of dimensions \(s_j=h_j-\alpha _j\), is given, and assume the camera centers are known. Then a scene \(\{\mathbf {X_i}\}\) can be reconstructed if and only if \(\sum _{i=1}^r \alpha _i \ge k.\) Indeed, a scene point \(\mathbf {X}\) is determined by the intersection of a suitable number of corresponding rays. Hence the reconstruction of a point is possible if and only if the intersection of r generic \((k-\alpha _i)\)-spaces, \(i=1,\ldots ,r,\) is at most one point. Using the Grassmann formula again, we get \(\sum _{i=1}^r \alpha _i \ge k.\)

In our case of two views, under the assumptions of Sect. 3, we have \(\alpha _1+\alpha _2 = k+1,\) and hence reconstruction of both cameras and scene points is indeed possible.

The role of the (generalized) fundamental matrix is better highlighted in the description of the algorithms involved in projective reconstruction. In the classical case of two projections from \({\mathbb P}^{3}\) to \({\mathbb P}^{2},\) the reconstruction procedure consists of the following steps:

  1. 1)

    Fundamental matrix from correspondences: If a sufficient number of corresponding points is known, one can determine the fundamental matrix F by solving the linear system given by the constraints \(\mathbf {x}^T\cdot F \cdot {\mathbf {x}}'=0.\) Note that, generically, 7 pairs of corresponding points are sufficient, see [4]; a finer analysis can be found in [17].

  2. 2)

    Projections from the fundamental matrix: Once F is determined, projection matrices can be reconstructed as in [4, Section 8.5.3].

  3. 3)

    Scene from projection matrices and correspondences: Once projection matrices are known, the scene points are determined by intersecting corresponding rays.

In the following subsection, we describe the exact geometric algorithms performing the above steps.

5.1 Reconstruction algorithms

All algorithms presented below are described for the general case of views of different dimensions, but they are implemented, for the sake of simplicity, with \(h_1=h_2=h.\) Furthermore, as seen in the previous sections, different profiles \((\alpha _1,\alpha _2)\) could be chosen in the setup of projective reconstruction. Our implementation uses \(\alpha _1=h\) and \(\alpha _2=k-h+1\). \(\mathrm {MATLAB}^{\circledR }\) code is available from the authors upon request.

5.1.1 Generalized fundamental matrix from correspondences

Recall that, for the profile \((h_1,k-h_1+1)\), the generalized fundamental matrix is defined as the matrix \(\mathfrak {F}\) such that \(\mathbf {x}^T\mathfrak {F}\mathbf {W} = 0\) for any corresponding pair \((\mathbf {x},W),\) where \(\mathbf {x}\) is a point in the first view, W is an \((h_1+h_2-k-1)\)-space of the second view, and \(\mathbf {W}\) is the vector of Plücker coordinates of W. Assume a large enough set of pairs \(\{(\mathbf {x_i},W_i)\}\) of corresponding spaces in the two views is given. Each of them yields a constraint \(\mathbf {x_i}^T\mathfrak {F}\mathbf {W_i}=0,\) linear in the unknown entries \(f_{rs}\) of \(\mathfrak {F}.\) The set of these constraints gives a linear system whose solution yields the entries of \(\mathfrak {F}\).
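A minimal illustrative implementation (in Python/NumPy rather than the authors' MATLAB code, with synthetic exact data) for the classical case \(h_1=h_2=2,\) \(k=3,\) where W reduces to a point: each constraint is rewritten as \((\mathbf {W}\otimes \mathbf {x})\cdot \mathrm {vec}(\mathfrak {F})=0,\) and the stacked system is solved by SVD.

```python
import numpy as np

rng = np.random.default_rng(5)
h1, h2, k = 2, 2, 3          # profile (h1, k-h1+1) = (2, 2)
n1, n2 = h1 + 1, h2 + 1      # here W is a point, so its Pluecker
                             # vector is its coordinate vector

# Synthetic corresponding pairs from a hidden generic camera pair.
A = rng.standard_normal((n1, k + 1))
B = rng.standard_normal((n2, k + 1))
pairs = [(A @ X, B @ X) for X in rng.standard_normal((12, k + 1))]

# x^T F W = 0 is linear in the entries of F: one Kronecker row per
# pair, then the (column-major) null vector gives F up to scale.
rows = np.array([np.kron(W, x) for x, W in pairs])
F = np.linalg.svd(rows)[2][-1].reshape((n1, n2), order='F')

x, W = pairs[0]
print(np.isclose(x @ F @ W, 0.0))            # True
print(np.linalg.matrix_rank(F, tol=1e-8))    # 2
```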

5.1.2 Projection matrices from the generalized fundamental matrix

Assume the generalized fundamental matrix \(\mathfrak {F}\) is given and one wants to recover a pair of projections A, B corresponding to it. As is standard in this context, everything being defined up to projective transformations, one can assume \(A = [I|O],\) where I is the identity matrix of order \(h_1+1\) and O denotes an \((h_1+1)\times (k-h_1)\) zero matrix. Thus one has only to recover the matrix B.

First notice that, due to the results of Sect. 3, from \(\mathfrak {F}\) one can extract a set of generators for the epipole \(E_2\) in the second view. Indeed we have seen that a set of generators for the projective space \({\mathbb P}^{\omega }\) spanned by \(\varOmega _{(k-\alpha _1,h_2)}(E_2)\) is \(\{\mathbf {b}_1^T \mathfrak {F}, \ldots , \mathbf {b}_{h_1+1}^T \mathfrak {F} \},\) where \(\{\mathbf {b}_1, \ldots , \mathbf {b}_{h_1+1} \}\) denotes the standard basis for \({\mathbb P}^{h_1}.\) For the chosen profile, a dimension count shows that \(\varOmega _{(k-\alpha _1,h_2)}(E_2)= {\mathbb P}^{\omega }\), hence each \(\mathbf {b}_i^T \mathfrak {F}\) is the vector of the Plücker coordinates of a \((k-\alpha _1)\)-subspace of the second view containing \(E_2.\) From the Plücker coordinates of \(\mathbf {b}_i^T \mathfrak {F} = \mathbf {W}_i,\) one can then recover a set of generators for the corresponding linear space \(W_i,\) with standard techniques.

For any pair of corresponding point and space \((\mathbf {x_t}, W_s)\), the constraint \(\mathbf {x_t}^T\mathfrak {F}\mathbf {W_s} = 0\) can be written as \(\det (M_{t,s})=0,\) where \(M_{t,s}=M(\mathbf {x_t},\mathbf {W_s})\) is an \((h_2+1)\times (h_2+1)\) matrix whose columns \(M(1), \ldots ,M(h_2+1)\) are:

  • \(M(j)= \mathbf {e}_j,\) for \(j=1,\ldots ,k-h_1,\) where \(\mathbf {e}_1 \ldots \mathbf {e}_{k-h_1}\) are vectors spanning the epipole \(E_2;\)

  • \(M(k-h_1+1) = \mathbf {y},\) where \(\mathbf {y}= B(\mathbf {p})\) with \(\mathbf {p}= A^{+}(\mathbf {x_t})\) and \(A \cdot A^{+}= I;\)

  • \(M(i) = \mathbf {z}_i\) for \(i=k-h_1+2,\ldots ,h_2+1,\) where the \(\mathbf {z}_i\) are points of \({\mathbb P}^{h_2}\) spanning \(W_s.\)

Then, up to a constant, one has \(f_{lm}=\det (M_{l,m})=\mathbf {x_l}^T\mathfrak {F}\mathbf {W_m},\) where \(\mathbf {x_l}\) is the l-th element of the standard basis of \({\mathbb P}^{h_1}\) and \(\mathbf {W_m}\) is the m-th element of the standard basis of \({\mathbb P}^{N_2}.\) Hence, one can compute \(\det (M_{l,m}),\) and thus all the elements \(f_{lm}\) of \(\mathfrak {F}\), by using the generalized Laplace expansion, considering the last \(s_2+1\) columns as a block, so that the minors of this block represent the Plücker coordinates of \(W_m\).

Hence, for a given pair \((\mathbf {x}, W),\) all the columns of M are determined by \(\mathfrak {F}\) except the \((k-h_1+1)\)-th, which depends linearly on the entries \(b_{uv}\) of B.

Letting \(\mathbf {x_l}\) vary among the elements of the standard basis of \({\mathbb P}^{h_1}\) and \(\mathbf {W_m}\) among the elements of a basis of the projective space \({\mathbb P}^{N_2},\) one gets a linear system \(f_{lm}= \det (M_{l,m})\) of \((h_1+1)\times (N_2+1)\) equations in the \((h_2+1)\times (k+1)\) unknowns \(b_{uv}\), which solves the problem. Existence of solutions for the above system is guaranteed by the birationality of the map \(\varPi \) introduced in the proof of Proposition 1. Moreover, in case of an underdetermined system with infinitely many solutions, the birationality of \(\varPi \) also implies that all solutions are projectively equivalent.
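In the classical case, this step admits a well-known closed form (cf. [4]): with \(A=[I|O],\) one may take \(B=[[\mathbf {e_2}]_{\times }\,\mathfrak {F}^T \mid \mathbf {e_2}],\) where \(\mathbf {e_2}\) spans the right null space of \(\mathfrak {F}.\) The following illustrative sketch uses our convention \(\mathbf {x}^T\mathfrak {F}\mathbf {x'}=0\) (the transpose of the convention most common in the computer vision literature).

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, with [v]_x w = v x w."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def cameras_from_F(F):
    """A canonical camera pair for a rank-2 classical F, in the
    convention x^T F y = 0 (x in the first view, y in the second)."""
    e2 = np.linalg.svd(F)[2][-1]          # right epipole: F e2 = 0
    A = np.hstack([np.eye(3), np.zeros((3, 1))])
    B = np.hstack([skew(e2) @ F.T, e2[:, None]])
    return A, B

# A random rank-2 matrix is a fundamental matrix (Theorem 2):
rng = np.random.default_rng(6)
F = (np.outer(rng.standard_normal(3), rng.standard_normal(3))
     + np.outer(rng.standard_normal(3), rng.standard_normal(3)))
A, B = cameras_from_F(F)
X = rng.standard_normal(4)
print(np.isclose((A @ X) @ F @ (B @ X), 0.0))   # True
```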

5.1.3 Scene from projection matrices and correspondences

Once a pair of projection matrices A and B has been recovered, given a pair of corresponding spaces \((\mathbf {x},W)\), it is possible to determine the scene point \(\mathbf {X}\) in \({\mathbb P}^{k}.\) More explicitly, one simply solves the system (1), where \(M_{L,L'}\) has been populated with \(A, B, \mathbf {x},\) and a set of generators for W.
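An illustrative sketch of this last step in the classical setting, with synthetic data:

```python
import numpy as np

rng = np.random.default_rng(7)
k, h1, h2 = 3, 2, 2          # W is a point of the second view
A = rng.standard_normal((h1 + 1, k + 1))
B = rng.standard_normal((h2 + 1, k + 1))

X_true = rng.standard_normal(k + 1)
x, W = A @ X_true, B @ X_true          # a corresponding pair

# Populate M_{L,L'} with A, B, x, W and solve system (1): the first
# k+1 entries of the null vector give the scene point X.
M = np.block([
    [A, x[:, None], np.zeros((h1 + 1, 1))],
    [B, np.zeros((h2 + 1, 1)), W[:, None]],
])
X = np.linalg.svd(M)[2][-1][:k + 1]

cos = abs(X @ X_true) / (np.linalg.norm(X) * np.linalg.norm(X_true))
print(np.isclose(cos, 1.0))            # True: X = X_true up to scale
```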