Abstract
In this paper, we consider the eyes of the human binocular system that simultaneously gaze at stationary point targets in space while optimally skipping from one target to the next by rotating their individual gaze directions. The head is assumed fixed on the torso, and the rotating gaze directions of the two eyes are assumed restricted to pass through a point in the visual space. It is further assumed that, individually, the rotations of the two eyes satisfy the well-known Listing’s law. We formulate and study a combined optimal gaze rotation for the two eyes by constructing a single Riemannian metric on the associated parameter space. The goal is to rotate optimally so that the convergent gaze changes between two pre-specified target points in a finite time interval [0, 1]. The cost function we choose is the total energy, measured by the \(L^2\) norm, of the six external torques on the binocular system. The torque functions are synthesized by solving an associated ‘two-point boundary value problem’. The paper demonstrates, via simulation, the shape of the optimal gaze trajectory of the focused point of the binocular system. The Euclidean distance between the initial and the final point is compared to the arc length of the optimal trajectory. The consumed energy is computed for different eye movement chores and discussed in the paper. Via simulation, we observe that certain eye movement maneuvers are energy efficient and demonstrate that the optimal external torque is a linear function of time. We also explore and conclude that splitting an arbitrary optimal eye movement into optimal vergence and version components is not energy efficient, although this is how the human oculomotor control seems to operate. The optimal gaze trajectories and optimal external torque functions reported in this paper are new.
1 Introduction
The class of problems that we consider in this paper is how the human eyes switch their focused points between two point targets in the visual space. Typically we assume that the eyes are on a stationary head and rotate to inspect point targets that are located in 3D. The gaze directions are always constrained to pass through a point, and the goal of the sensing mechanism is to start with a target fixed in its view and to switch to an alternate target in the visual field in a fixed interval of time, assumed to be [0, 1]. The paper analyses optimal rotation for a pair of human eyes as an optimal control problem over a fixed interval of time, extending two of our prior works [1, 2]. In our earlier research, binocular eye movement has been studied as a cascade of version and vergence eye rotations applied to the two eyes separately (see also [3, 4] and Sect. 6.2 of this paper).
Modeling the dynamics of monocular eye rotation has been an important goal in Neuroscience (see [5] for a short review article on how the brain controls eye movement) and Biomechanics [6]. Since the early half of the 19th century, scientists have tried to create dynamic models in order to understand various eye movement trajectories (see [7] for some historical details). The eyes rotate with three degrees of freedom [8, 9], rendering the eye movement system a relatively simple mechanical control system compared to other complex human movement systems [10]. Starting from some of the initial papers of [11], the principles from geometry, for example as in [12,13,14], are central to many of the key questions in nonlinear systems theory (see [15]), applied to rotational dynamics. Specific to the eye movement control system, we would also like to refer to [16,17,18] and many references therein. For a single eye, optimization problems associated with gaze control [19] have been extended to optimal control problems studied by [20, 21] and [22].
Geometric methods are well known in the study of eye movement rotation (see [23,24,25] and [26]). Riemannian geometry (see [27, 28]) has been introduced for monocular control problems in [8]. The main contribution of this paper is to extend the Riemannian geometric formulation to binocular control problems. A configuration space for the binocular eye pair is described as a subset of \({\mathbf{SO}(3)} \times {\mathbf{SO}(3)}\). As described in our earlier paper [21], to every gaze direction of the eye there corresponds a circle of rotation matrices. This ambiguity is resolved by imposing Listing’s Law on each of the two eyes (see Footnote 1). Additionally, we assume that the gaze directions of the two eyes always meet at a point in the visual space \({\mathbb {R}}^3\), i.e., the eyes are always focused (see Fig. 1).
To end this section, we would like to make the point that the study of binocular eye movement was perhaps introduced by [30], not much later than the original eye movement studies of Listing in 1845 and Donders in 1848. In spite of that, the literature on how the binocular eye pair is actuated is rather sparse (see [31] and [32]). Human eyes, according to [30], are actuated ‘versionally’ towards a target, followed by a ‘vergence’ focusing mechanism (see Fig. 2). In an actual eye movement maneuver, the order of vergence and version can repeat multiple times. The Donders’ constraint during the specific eye movements has also been studied in [32], and we would like to refer our readers to [4] (see also [33, 34] and [35] for other aspects of binocular eye movement studies). In a recent study (see [36]), the role of torsional rotation of the human eye in matching the two retinal images has been introduced. The Riemannian geometric approach to optimal binocular eye movement studied in this paper is new.
2 Notations and terminology
We start this section by introducing the axis-angle parametrization, see Fig. 3, of quaternions where the notations are borrowed from [20, 21] and [8]. Let us begin with the space of quaternions denoted by \({\varvec{Q}}\), see [37] and write each \(q \in {\varvec{Q}}\) as \(q_0{{\varvec{1}}} + q_1{{\varvec{i}}}+q_2{{\varvec{j}}}+q_3{{\varvec{k}}}\).
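The space of unit quaternions will be identified with the unit sphere \(\varvec{S^3}\), and each unit quaternion can be written as
\[q = \cos \dfrac{\phi }{2}\,{{\varvec{1}}} + \sin \dfrac{\phi }{2}\,\big (n_1{{\varvec{i}}}+n_2{{\varvec{j}}}+n_3{{\varvec{k}}}\big ), \qquad (1)\]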
where \(\phi \in [0,2\pi ]\) is an angle variable and \(n=(n_1,n_2,n_3)\) is a unit axis vector in \({\mathbb {R}}^3\). We denote by \(\mathbf{rot}\), the standard map from \(\boldsymbol {S^3}\) into \(\mathbf{ SO (3)}\) which maps the quaternion q to an orthogonal matrix that rotates a vector in \({\mathbb {R}}^3\) around the axis of rotation n by a counterclockwise angle \(\phi\) (see Fig. 3). It is easy to verify (see (3) in [21]) that
The orthogonal matrix (2) can be associated with the orientation of a rotating rigid body as follows: the columns of (2) are mutually orthogonal unit vectors, and we associate these three column vectors with three body coordinates that describe the orientation. The rotating rigid body, viz. the human eye, has a specific ‘gaze direction,’ a vector whose direction is what we propose to control. We use the convention that the gaze direction is given by the third column of the rotation matrix (2). We therefore have the following projection map, projecting the orientation matrix (2) to a gaze direction vector
where
Typically our interest is to control the gaze vector (3) so that it points towards a suitable point target (see Fig. 4). As has been commented in Sect. 1, additional constraints on the quaternion q need to be imposed so that the constrained orientation matrix with a specific gaze direction is unique. We consider the pair of eyes to individually satisfy Listing’s constraint, given by \(q_3=0\). In the literature on binocular rotation, a general Donders’ constraint is also of interest (see [36], where torsional movements allow fusion of images from the two eyes); it will be considered in later papers.
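The conventions of this section can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the standard unit-quaternion-to-rotation-matrix formula for \(\mathbf{rot}\) and takes the Listing-constrained rotation axis to be \((\cos \theta , \sin \theta , 0)\); the function names are hypothetical.

```python
import math

def axis_angle_quaternion(phi, n):
    """Unit quaternion (q0, q1, q2, q3) for a rotation by phi about the unit axis n."""
    half = phi / 2.0
    s = math.sin(half)
    return (math.cos(half), s * n[0], s * n[1], s * n[2])

def rot(q):
    """Standard map from a unit quaternion to a 3x3 rotation matrix (list of rows)."""
    q0, q1, q2, q3 = q
    return [
        [q0*q0 + q1*q1 - q2*q2 - q3*q3, 2*(q1*q2 - q0*q3), 2*(q1*q3 + q0*q2)],
        [2*(q1*q2 + q0*q3), q0*q0 - q1*q1 + q2*q2 - q3*q3, 2*(q2*q3 - q0*q1)],
        [2*(q1*q3 - q0*q2), 2*(q2*q3 + q0*q1), q0*q0 - q1*q1 - q2*q2 + q3*q3],
    ]

def gaze(q):
    """Gaze direction, taken as the third column of rot(q)."""
    R = rot(q)
    return (R[0][2], R[1][2], R[2][2])

def listing_quaternion(theta, phi):
    """Quaternion with rotation axis in the plane z = 0 (assumed axis form), so q3 = 0."""
    return axis_angle_quaternion(phi, (math.cos(theta), math.sin(theta), 0.0))
```

Under these assumptions the gaze of a Listing-constrained eye works out to \((\sin \phi \sin \theta , -\sin \phi \cos \theta , \cos \phi )\), so the primary gaze direction \(\phi = 0\) is the forward-pointing z-axis.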
3 Riemannian metric on human binocular system
We begin by considering parametrization of a point in \(\boldsymbol {S^3}\), as introduced in (1) and further describe the unit vector n, the axis of rotation, as
where \(\theta \in [0,\pi ]\) and \(\alpha \in [-\dfrac{\pi }{2},\dfrac{\pi }{2}].\) Together, the parameterizations (1) and (4) describe what is known in [21] as the axis-angle parameterization of \(\boldsymbol {S^3}\) and \(\mathbf{ SO (3)}\) using the mapping ‘\(\mathbf{rot}\)’. In order for the orientation of a single eye to satisfy Listing’s constraint, \(q_3=0\), we impose \(\alpha =0\), forcing the axis of rotation to always lie on the plane \(z=0\). This reduces the quaternion in (1) to the form (see [8])
We now introduce two such quaternions, one for the left eye \(q^L\) and the other for the right eye \(q^R\), described as
and
The pair \((q^L, q^R)\) is thus an element of \(\boldsymbol {S^3} \times \boldsymbol {S^3}\). Note that the gaze directions corresponding to the left and the right eye are given by
Let us now assume that the left and the right eyes have centers separated by the vector \(e=(x,y,z)=(0,1,0)\) as shown in Fig. 4. The figure shows that the centers of the left and the right eyes are located respectively at (0, 0, 0) and (0, 1, 0). The configuration space of the binocular system is now described by imposing that the vectors \(g_L, g_R\) and e are coplanar. Such a co-planarity condition will impose that the gaze directions of the left and the right eyes always meet at a point. Thus we have
We denote by \(\mathbf{LBIN}\) (L stands for Listing, BIN stands for Binocular), the subset of \(\mathbf{ SO (3)} \times \mathbf{ SO (3)}\) where the orientation matrices separately obey Listing’s Law and together the corresponding gaze directions satisfy the co-planarity condition (10). Equivalently, \(\mathbf{LBIN}\) is a subset of \(\mathbf{LIST} \times \mathbf{LIST}\) (see [8]) where the gaze directions of each component satisfy (10). Let \(\rho\) be the mapping
described as
where from (10) we have
Let us denote by \({\varvec{P}}\) the ‘yz-plane’ (also called the Transverse Plane). We remark that the right hand side function of (13) is well defined if \(g_L \not \in {\varvec{P}}\) and \(g_R \not \in {\varvec{P}}\), i.e., if the eyes are typically looking above or below, in the quadrant of positive or negative x-axis respectively. On the other hand, assuming that \(g_L \ne (0,0,1)\) and \(g_R \ne (0,0,1)\), the right hand side function of (13) is not defined if \(g_L \in {\varvec{P}}\) and \(g_R \in {\varvec{P}}\). In Sect. 5 of this paper, we assume that the eyes are always looking in the quadrant corresponding to positive x-axis.
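To make the co-planarity condition concrete, the sketch below solves \(\det [g_L, g_R, e] = 0\) for \(\phi ^R\). It rests on two assumptions that are not taken from the displayed equations: the gaze of a Listing-constrained eye is \((\sin \phi \sin \theta , -\sin \phi \cos \theta , \cos \phi )\), and with \(e=(0,1,0)\) the condition then reduces to \(\tan \phi ^R = \tan \phi ^L \sin \theta ^L / \sin \theta ^R\). The function names are hypothetical.

```python
import math

E = (0.0, 1.0, 0.0)  # separation vector between the two eye centers, as in the text

def gaze_from_angles(theta, phi):
    """Assumed gaze of a Listing-constrained eye (third column of the rotation matrix)."""
    return (math.sin(phi) * math.sin(theta),
            -math.sin(phi) * math.cos(theta),
            math.cos(phi))

def phi_right(theta_l, phi_l, theta_r):
    """Solve the co-planarity condition for phi_R (assumed form of the constraint)."""
    num = math.sin(phi_l) * math.sin(theta_l)
    den = math.cos(phi_l) * math.sin(theta_r)
    if abs(num) < 1e-12 and abs(den) < 1e-12:
        # Degenerate case: the constraint does not determine phi_R.
        raise ValueError("co-planarity constraint is degenerate for these angles")
    return math.atan2(num, den)

def coplanarity(g_l, g_r, e=E):
    """Scalar triple product e . (g_L x g_R); it vanishes when the vectors are coplanar."""
    cx = g_l[1] * g_r[2] - g_l[2] * g_r[1]
    cy = g_l[2] * g_r[0] - g_l[0] * g_r[2]
    cz = g_l[0] * g_r[1] - g_l[1] * g_r[0]
    return e[0] * cx + e[1] * cy + e[2] * cz
```

In this form the quotient becomes 0/0 when both \(\sin \theta ^L\) and \(\sin \theta ^R\) vanish, which matches the remark that the map is undefined when both gaze directions lie in \({\varvec{P}}\) without pointing straight ahead.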
A Riemannian metric on \(\mathbf{LBIN}\) is easily induced from \(\mathbf{ SO (3)} \times \mathbf{ SO (3)}\). We define elements \(g_{ij}\) of the symmetric Riemannian matrix \(G_{LB}\) as (see Footnote 2)
and compute (see Footnote 3) the Riemannian metric g given by
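As a numerical illustration of how a metric can be induced on \(\mathbf{LBIN}\), the sketch below pulls back the Euclidean metric of \({\mathbb {R}}^4 \times {\mathbb {R}}^4\) through the map \((\theta ^L, \phi ^L, \theta ^R) \mapsto (q^L, q^R)\), using a finite-difference Jacobian. This is an assumption-laden stand-in: the normalization and any inertia weighting used for \(G_{LB}\) in the paper may differ, the formula for \(\phi ^R\) is the assumed form of the co-planarity constraint, and `embed` and `metric` are hypothetical names.

```python
import math

def embed(theta_l, phi_l, theta_r):
    """Map the three free angles to the pair (q_L, q_R), flattened into 8 numbers."""
    # phi_R from the assumed co-planarity relation tan(phi_R) = tan(phi_L) sin(theta_L) / sin(theta_R).
    phi_r = math.atan2(math.sin(phi_l) * math.sin(theta_l),
                       math.cos(phi_l) * math.sin(theta_r))

    def q(theta, phi):
        # Listing-constrained unit quaternion (q3 = 0).
        return [math.cos(phi / 2), math.sin(phi / 2) * math.cos(theta),
                math.sin(phi / 2) * math.sin(theta), 0.0]

    return q(theta_l, phi_l) + q(theta_r, phi_r)

def metric(x, h=1e-6):
    """3x3 pullback matrix G = J^T J, with J obtained by central differences."""
    columns = []
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        fp, fm = embed(*xp), embed(*xm)
        columns.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    return [[sum(a * b for a, b in zip(columns[i], columns[j])) for j in range(3)]
            for i in range(3)]
```

Away from the degenerate configurations the resulting matrix is symmetric and positive definite, as any Riemannian matrix must be.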
4 Euler-Lagrangian formulation of binocular eye movement
Since a Riemannian metric defines kinetic energy on the manifold, we use g in (15) to define the Lagrangian \({\mathcal {L}}=\dfrac{1}{2}~ g\) of the binocular system (see Footnote 5). The controlled Euler-Lagrange equations are given by
where \(\mu \in \{\theta ^L,\phi ^L,\theta ^R\}\). It follows from [21] that (17) can be written as
where \(G_{LB}\) is the Riemannian matrix, \(\nabla _{\varTheta }\) is the gradient \(\Big [ \dfrac{\partial }{\partial {\theta ^L}} ~~\dfrac{\partial }{\partial {\phi ^L}}~~\dfrac{\partial }{\partial {\theta ^R}}\Big ]\), \(\varTheta = [\theta ^L~~\phi ^L~~\theta ^R]^{\mathrm{T}}\), and \(\tau\) is the vector of generalized torques \(\tau _\mu\). Further as [21] describes, we define the external torque vector T, in the inertial coordinate to be
and (see Footnote 6)
Remark 1
The columns of the matrix M are called the Euler basis vectors, see [38], where T has been described as the resultant moment relative to the center of mass on the body.
Now we set up our dynamical system for the binocular eye rotation by defining
We require that the states go from some a priori agreed Z(0) to Z(1) while minimizing the control energy in a fixed interval of time
where T is the vector of external torques on the system given by (see Footnote 7)
We denote the costate variables by
and define the Hamiltonian as
Using Hamilton’s equations [13], the system
is now obtained. Using equations (18), (19), and (21), one can recast equation (25) as
where \(\varLambda _1 = [\lambda _1~~ \lambda _3~~ \lambda _5]^{\mathrm{T}}\), \(\varLambda _2 = [\lambda _2~~ \lambda _4~~ \lambda _6]^{\mathrm{T}}\), \(Z_1 = [z_1~~ z_3~~ z_5]^{\mathrm{T}}\), and \(Z_2 = [z_2~~ z_4~~ z_6]^{\mathrm{T}}\). Finally, using Pontryagin’s Maximum Principle, the expressions for optimal external torques (see [8]) are obtained:
which we write symbolically as
The control torques can now be eliminated from the state space system (26) and we obtain the following dynamical system
Since we know only the initial and the final value of Z, we have a two-point boundary value problem. The resulting problem is solved using the COMSOL Multiphysics program (see [39] and Footnote 8). The computed Z and \(\varLambda\) variables are substituted into (29) to obtain the optimal vector T, which is denoted by \(T_{\mathrm{BVP}}\). In Sect. 5, the optimal \(T_{\mathrm{BVP}}\) is computed and plotted for ten different eye movement examples. In the following paragraph we describe how the ordinary differential equation (30) is implemented in COMSOL.
COMSOL is a finite-element based software that can be used to solve both ordinary and partial differential equations. It has a graphical-user-interface (GUI), where one can implement the modeling equations and corresponding initial and boundary conditions. In this paper, we use the COMSOL’s Coefficient form PDE module to implement the system of ODEs in (30). The generic PDE takes the form:
When implementing (30) as a boundary value problem, we set the coefficient matrices \(e_a = d_a = c = \alpha = \gamma = a = 0\) and the unknown vector \({\varvec{u}} = [Z ~~ \varLambda ]^{\mathrm{T}} = [z_1~~\cdots ~~ z_6~~ \lambda _1~~ \cdots ~~\lambda _6]^{\mathrm{T}}\). Further, we choose the parameters \(\beta = I_{12}\) and \(f = {\tilde{F}}[Z~~\varLambda ]\). Finally, the space variable in (31) is taken as time t, and hence, \(\nabla := \dfrac{\partial }{\partial t}\) in (31). In this problem, both boundary conditions correspond to the state variables \(z_1\ldots z_6\), and the boundary values of the costate variables stay free at both end-points of the time interval. We impose Dirichlet boundary conditions for the state variables at \(t=0\) using the first 6 equations and at \(t=1\) using the last 6 equations, and do not specify any boundary conditions for the costate variables.
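The structure of such a state/costate boundary value problem can be illustrated on a much smaller system. The sketch below is not the COMSOL implementation and not the binocular dynamics: it is a toy double integrator (position and velocity driven by a scalar control u) with cost \(\tfrac{1}{2}\int _0^1 u^2\,dt\), solved by a shooting method on the unknown initial costates. As in the text, states are fixed at both ends and costates are left free. All names are hypothetical.

```python
def rhs(y):
    """State/costate equations of the toy problem; y = [pos, vel, lam1, lam2]."""
    pos, vel, lam1, lam2 = y
    u = -lam2  # stationarity of H = u^2/2 + lam1*vel + lam2*u with respect to u
    return [vel, u, 0.0, -lam1]

def rk4(y, n=100):
    """Integrate from t = 0 to t = 1 with n classical Runge-Kutta steps."""
    h = 1.0 / n
    for _ in range(n):
        k1 = rhs(y)
        k2 = rhs([a + 0.5 * h * b for a, b in zip(y, k1)])
        k3 = rhs([a + 0.5 * h * b for a, b in zip(y, k2)])
        k4 = rhs([a + h * b for a, b in zip(y, k3)])
        y = [a + h * (p + 2 * q + 2 * r + s) / 6
             for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
    return y

def solve_toy_bvp(pos1=1.0, vel1=0.0):
    """Find initial costates (a, b) steering (0, 0) to (pos1, vel1) at t = 1."""
    # The toy system is linear, so the end state is linear in (a, b):
    # integrate the two unit costate vectors and solve a 2x2 linear system.
    ea = rk4([0.0, 0.0, 1.0, 0.0])
    eb = rk4([0.0, 0.0, 0.0, 1.0])
    m11, m21, m12, m22 = ea[0], ea[1], eb[0], eb[1]
    det = m11 * m22 - m12 * m21
    a = (pos1 * m22 - m12 * vel1) / det
    b = (m11 * vel1 - m21 * pos1) / det
    return a, b

def control(a, b, t):
    """Optimal control u(t) = -lam2(t), where lam2(t) = b - a*t."""
    return -(b - a * t)
```

For the nonlinear system (30) the end state is no longer linear in the initial costates, so a shooting approach would need an iterative root finder; a collocation or finite-element solver, as used in the paper, is the more robust choice.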
5 Eye movement scenarios on LBIN explored by simulation
In this section, we present simulation results corresponding to three primary types of eye movements that arise from gazing at stationary targets in space: Right-to-Left (RL), Near-to-Far (NF), and Top-to-Bottom (TB), and vice versa. Further, we focus on gazing at near targets as well as distant targets. During our simulation, we consider combinations of the primary eye movements along with the distance of the gaze. Using the coordinates of the given initial and final target positions, we calculate the corresponding values for the angle variables: \(\theta ^L, \, \phi ^L,\) and \(\theta ^R\) and use the computed angle variables as boundary conditions for the resulting two-point boundary value problem described using (30). The optimal trajectories for the generalized coordinates, velocities, torques, and the external torques are obtained over a unit interval of time (see Footnote 9).
The simulation results consist of solving the two-point boundary value problem for each eye movement scenario given in Figs. 5, 6, 7, 8, 9, 10, 11, 12, 13 and 14. In each figure, we illustrate the optimal variations of (a) generalized coordinates, (b) generalized velocities, (c) generalized torques, and (d) external torques. Further, in subplots (e)–(h), the change in the focused gaze point of the two eyes and its projections on coordinate planes are shown when the gaze switches from one point target to another. Figs. 5, 6, 7 show simulation results for gazing at near targets and the remaining Figs. 8, 9, 10, 11, 12, 13 and 14 illustrate results for gazing at distant targets.
In each of the examples below, the centers of the two eyes are located at (0, 0, 0) and (0, 1, 0). The separation between the eyes is along the y-axis and is assumed to be of unit length (see Fig. 4).
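The conversion from a target position to the boundary values \((\theta ^L, \phi ^L, \theta ^R)\) can be sketched as follows. The sketch assumes that the gaze of a Listing-constrained eye is \((\sin \phi \sin \theta , -\sin \phi \cos \theta , \cos \phi )\) and that the target has a nonnegative x-coordinate relative to each eye, so that \(\theta \in [0,\pi ]\); the function names are hypothetical.

```python
import math

LEFT_EYE = (0.0, 0.0, 0.0)   # eye centers as stated in the text
RIGHT_EYE = (0.0, 1.0, 0.0)

def gaze_angles(target, eye):
    """Return (theta, phi) of the unit gaze vector pointing from eye to target."""
    d = [t - c for t, c in zip(target, eye)]
    norm = math.sqrt(sum(c * c for c in d))
    gx, gy, gz = (c / norm for c in d)
    phi = math.acos(gz)            # cos(phi) is the z-component of the gaze
    theta = math.atan2(gx, -gy)    # sin(theta) ~ g_x and cos(theta) ~ -g_y
    return theta, phi

def boundary_angles(target):
    """Angle variables (theta_L, phi_L, theta_R) for a given target point."""
    theta_l, phi_l = gaze_angles(target, LEFT_EYE)
    theta_r, _ = gaze_angles(target, RIGHT_EYE)
    return theta_l, phi_l, theta_r
```

For instance, `boundary_angles((1, 1, 1))` and `boundary_angles((1, 0, 1))` would supply the initial and final angle values for the targets of Example 1 under these assumptions. The angle \(\theta\) is not determined when the gaze points exactly along the z-axis.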
Example 1
(SV-RL) The target points are at a ‘short view’ compared to the separation of the two eyes. Two eyes are gazing from right-to-left to move between targets that are located at the points (1, 1, 1) and (1, 0, 1). Hence, both eyes have to primarily rotate about the upward-pointing axis (x-axis) in the counterclockwise direction (see Fig. 5h). In Fig. 5d it is also observed that the torques \(T_x^L\) and \(T_x^R\) are applied about the x-axis. In Fig. 5f it is observed that the eyes rotate in the clockwise direction about the forward-pointing z-axis. Thus the torques \(T_z^L\) and \(T_z^R\) being applied about the z-axis in the clockwise direction are depicted in Fig. 5d with a sign opposite to that of \(T_x^L\) and \(T_x^R\).
Example 2
(SV-NF) The target points are at a ‘short view’ compared to the separation of the two eyes. Two eyes are gazing from near-to-far to move between targets that are located at the points (1, 0.5, 1) and (1, 0.5, 2). Hence, both eyes have to primarily rotate about the right-pointing axis (y-axis) in the clockwise direction (see Fig. 6g). In Fig. 6d it is also observed that the torques \(T_x^L\) and \(T_x^R\) are applied about the x-axis. In Fig. 6h it is observed that the eyes rotate in opposite directions about the upward-pointing x-axis. Hence \(T_x^L\) and \(T_x^R\) have opposite signs.
Example 3
(SV-TB) The target points are at a ‘short view’ compared to the separation of the two eyes. Two eyes are gazing from top-to-bottom to move between targets that are located at the points (2, 0.5, 1) and (1, 0.5, 1). Therefore, both eyes have to primarily rotate about the right-pointing axis (y-axis), and the direction of rotation about the y-axis should be in the clockwise direction for both eyes, and hence, the initial signs of the external torques, about the y-axis, are negative. This behavior can be observed in Fig. 7d as larger external torques are applied about the y-axis: \(T_y^L\) and \(T_y^R\). Interestingly, in contrast to Example 2, both eyes experience nonzero external torque values along the z-axis.
Example 4
(LV-RL) The target points are at a ‘long view’ compared to the separation of the two eyes. Two eyes are gazing from right-to-left to move between targets that are located at the points (3, 2, 4) and \((3,-2,4)\). As depicted in Fig. 8d and h, the primary axis of rotation is the upward-pointing x-axis, and hence, we observe larger external torques around it. In Fig. 8f, it is observed that the eyes rotate in the clockwise direction about the forward-pointing z-axis. Thus the torques \(T_z^L\) and \(T_z^R\) being applied about the z-axis in the clockwise direction are depicted in Fig. 8d with a sign opposite to that of \(T_x^L\) and \(T_x^R\).
Example 5
(LV-NF) The target points are at a ‘long view’ compared to the separation of the two eyes. Two eyes are gazing from near-to-far and moving between targets that are located at the points (3, 2, 4) and (3, 2, 8). Hence, both eyes are primarily rotating about the right-pointing y-axis in the clockwise direction (see Fig. 9g). In Fig. 9d, it is also observed that the torques \(T_x^L\) and \(T_x^R\) are applied about the x-axis. Using the locations of the two eyes and from Fig. 9h, it is observed that both eyes rotate in the counterclockwise direction about the upward-pointing x-axis.
Example 6
(LV-TB) The target points are at a ‘long view’ compared to the separation of the two eyes. Two eyes are gazing from top-to-bottom to move between targets that are located at the points (7, 2, 4) and (3, 2, 4). Therefore, again, both eyes have to primarily rotate about the right-pointing axis (y-axis), and as illustrated in Fig. 10g, the direction of rotation about the y-axis should be in the clockwise direction for both eyes, and hence, the initial sign of the external torques is negative. This behavior can be observed in Fig. 10d as larger external torques are applied about the y-axis: \(T_y^L\) and \(T_y^R\). It is also observed from Fig. 10f that the eyes rotate about the z-axis in the same direction (counterclockwise). Further, Fig. 10h indicates the presence of a rotation of two eyes about the upward-pointing x-axis.
Example 7
(LV-RLNF) The target points are at a ‘long view’ compared to the separation of the two eyes. Two eyes are gazing from right-to-left with near-to-far (RLNF) to move between targets that are located at the points (3, 2, 4) and \((3,-2,8)\). In this eye movement scenario, as shown in Fig. 11d, g, and h, both eyes primarily rotate about the upward-pointing axis (x-axis) as well as the right-pointing axis (y-axis). About the former axis, the eyes rotate in the counterclockwise direction, and about the latter axis the rotation is in the clockwise direction. Moreover, the largest external torques are applied about the x-axis: \(T_x^L\) and \(T_x^R\). It is also observed from Fig. 11f that the eyes rotate about the z-axis in the clockwise direction, yielding negative initial external torques \(T_z^L\) and \(T_z^R\).
Example 8
(LV-RLTB) The target points are at a ‘long view’ compared to the separation of the two eyes. Two eyes are gazing from right-to-left with top-to-bottom (RLTB) to move between targets that are located at the points (7, 2, 4) and \((3,-2,4)\). Similar to RLNF, both eyes primarily rotate about the upward-pointing axis (x-axis) as well as the right-pointing axis (y-axis). As illustrated in Fig. 12d, g, and h, the eyes rotate about the x-axis in the counterclockwise direction, resulting in positive initial torques: \(T_x^L\) and \(T_x^R\). Further, it is also observed from Fig. 12f that the eyes rotate about the z-axis in the clockwise direction, yielding negative initial external torques \(T_z^L\) and \(T_z^R\).
Example 9
(LV-TBNF) The target points are at a ‘long view’ compared to the separation of the two eyes. Two eyes are gazing from top-to-bottom with near-to-far (TBNF) to move between targets that are located at the points (7, 2, 4) and (3, 2, 8). In this eye movement scenario, Fig. 13d and g indicate that both eyes primarily rotate about the right-pointing axis (y-axis) in the clockwise direction, resulting in negative initial external torques: \(T_y^L\) and \(T_y^R\). Interestingly, Fig. 13f and h show counterclockwise rotations about the x- and z-axes.
Example 10
(LV-TBRLNF) The target points are at a ‘long view’ compared to the separation of the two eyes. Two eyes are gazing from top-to-bottom with right-to-left and near-to-far (TBRLNF) to move between targets that are located at the points (7, 2, 4) and \((3,-2,8)\). Figure 14d indicates that both eyes primarily rotate about the x- and y-axes and possess larger initial external torques about those two axes. Further, Fig. 14f, g, and h show a counterclockwise rotation about the x-axis and clockwise rotations about the remaining two axes for both eyes.
For each of Examples 1–10, in Table 1 we have entered the initial and the final points of the target, the Euclidean distance between the two points together with the arc length, the energy spent to make the optimal move and the corresponding energy spent per unit arc length. With these data, the different examples can now be compared as has been done in Sec. 6.
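The geometric quantities compared in Table 1 are straightforward to compute. The sketch below lists the target pairs stated in Examples 1–10 and provides the Euclidean distance, the arc length of a sampled gaze trajectory (approximated by a polyline), and the energy per unit arc length. The trajectory samples and energy values would come from the boundary value solution and are not reproduced here; the function names are hypothetical.

```python
import math

# Initial and final target points of Examples 1-10, as given in the text.
TARGETS = {
    1: ((1, 1, 1), (1, 0, 1)),
    2: ((1, 0.5, 1), (1, 0.5, 2)),
    3: ((2, 0.5, 1), (1, 0.5, 1)),
    4: ((3, 2, 4), (3, -2, 4)),
    5: ((3, 2, 4), (3, 2, 8)),
    6: ((7, 2, 4), (3, 2, 4)),
    7: ((3, 2, 4), (3, -2, 8)),
    8: ((7, 2, 4), (3, -2, 4)),
    9: ((7, 2, 4), (3, 2, 8)),
    10: ((7, 2, 4), (3, -2, 8)),
}

def euclidean_distance(example):
    """Straight-line distance between the two targets of an example."""
    start, end = TARGETS[example]
    return math.dist(start, end)

def arc_length(points):
    """Length of the polyline through the sampled gaze points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def energy_per_arc_length(energy, points):
    """Energy divided by the arc length of the sampled trajectory."""
    return energy / arc_length(points)
```

Since a polyline between fixed end points is never shorter than the chord, the arc length is always bounded below by the Euclidean distance, and equality holds only for a straight trajectory.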
The computed values of the optimal external torque function (let us call this \(T_{\mathrm{BVP}}(t)\)), for each of the ten examples 1–10 have been plotted in Figs. 5d, 6d, 7d, 8d, 9d, 10d, 11d, 12d, 13d and 14d. In Table 2, we have noted down all the six components of \(T_{\mathrm{BVP}}(0)\), which is the initial value of the optimal external torque function. The table lists the initial torque vector for each of the ten examples.
6 Discussion
6.1 Discussion on the simulation results for Examples 1–10
Examples 1–3, in Figs. 5d, 6d, 7d, show that both the eyes experience the same magnitude in the corresponding external torque components, i.e., \(|T_x^L| = |T_x^R|,\, |T_y^L| = |T_y^R|,\) and \(|T_z^L| = |T_z^R|\). This property is due to the symmetry between the two eyes and the initial and final target positions. Examples 4–6, in Figs. 8, 9, 10, correspond to gazing at distant targets and exhibit similar qualitative properties in the variation of the external torques as in the first three examples. However, we observe a slight difference in the corresponding external torques (\(T_{i}^L \ne T_{i}^R\) for \(i \in \{x,y,z\}\)) that arises due to the asymmetry in the configurations of the left and right eyes with respect to the initial and final target positions. We make the following observations:
- Optimal path that includes ‘Moving Right to Left’ is not a straight line: In Examples 1, 4, 7, 8 and 10, wherein the initial and the final points involve movements in the ‘Right to Left direction’, the arc lengths are higher than the straight line distance between the initial and the final target positions (see Table 1). For sideways movement of the gaze, the optimal arc is not straight.
- Optimal ‘Near to Far and Top to Bottom Movements’ are along a straight line: In Examples 2, 3, 5, 6 and 9, wherein the initial and the final points do not involve movements in the ‘Right to Left direction,’ the arc lengths are close to the straight line distance between the initial and the final target positions (see Table 1). Said differently, optimal arcs in the xz-plane are nearly straight.
- ‘Optimal movements between targets at a short view are more energy expensive,’ compared to targets that are further away. Comparing the energy spent per unit arc length for Examples 1 and 4, 2 and 5, 3 and 6 in Table 1 confirms this statement.
- ‘Right to Left movements are more expensive’ compared to Near to Far and Top to Bottom movements. This fact is evident by comparing Example 1 with 2 and 3, and Example 4 with 5 and 6 (see Table 1, energy spent per unit arc length). When the primary eye movements are combined, this deficiency is not particularly visible (see Table 1, rows 7–10).
- In many of the examples, the components \(T_x\), \(T_y\) and \(T_z\) of the optimal torques for the left and the right eyes satisfy Hering’s Law (see Footnote 10). It is evident from Figs. 5d, 6d, 7d, 8d (Examples 1–4) that the \(T_x,\) \(T_y\) and \(T_z\) components of the external torque vector are the same time function, up to sign.
Regarding satisfaction of Hering’s Law, an exception occurs in Example 5, see Fig. 11d, wherein the \(T_x\) components are substantially different between the left and the right eyes. In Example 6, see Fig. 10d, the \(T_z\) and the \(T_x\) components do not have precisely the same magnitude, whereas \(T_y\), the strongest component, does have the same magnitude between the two eyes. Similar comments can be made for other examples. It is indeed quite interesting to note that in Example 10, see Fig. 14d, wherein the eye movement has all the three primary components, Hering’s Law is satisfied. The three external torque components in the two eyes are almost identical. In Example 11, to be discussed in Sect. 6.2, the initial and the final points are chosen in such a way that the left eye does not have to move and only right eye movement is sufficient to switch the gaze from an initial to a final point. In this example, we show that optimal eye movement does not support Hering’s Law.
6.2 Hering’s and Helmholtz’ controversy
We now ask whether, from the point of view of optimal control, it makes sense to assume that the left and the right eyes are simultaneously innervated or that the innervations to the two eyes are separate. In [34], a binocular eye movement experiment was conducted in which the initial point, the final point and the center of the left eye were in one line. The binocular gaze was shifted from the initial to the final point. In this experiment, the left eye does not need to move at all, whereas the right eye needs to rotate. If the eyes were innervated separately, only the right eye would be required to move, as was proposed by Helmholtz. In the experiment, however, the eye movement was observed to be split into a cascade of vergence and version movements. In Example 11, the experiment from [34] is repeated from the point of view of optimal control. In Example 12, the vergence/version decomposition of Example 11 is described. In Example 13, the vergence/version decomposition of Example 10 is described. The details exhibited in Figs. 16 and 17 are described as follows.
Example 11
The gaze of the binocular eyes is rotating between points (1, 1, 1) and (3, 3, 3). The two eyes are centered as before at points (0, 0, 0) and (0, 1, 0). As shown in Fig. 15, the optimal gaze trajectory is a straight line wherein the left eye does not rotate at all and the right eye changes its gaze from the initial to the final point. Thus the optimal eye movement corroborates Helmholtz’s prediction of how the eyes are controlled, i.e., they are innervated separately.
Example 12
(vergence/version) We now split the eye movement in Example 11 into a cascade of ‘vergence’ followed by ‘version’, as indicated in [34]. It would follow that under vergence the gaze point moves between (1, 1, 1) and (3.28, 2.14, 3.28) (see Footnote 11). It is evident from Fig. 16d that the two eyes move in opposite directions during the vergence phase of the eye movement, indicated by the opposite signs of \(T_z^L\) and \(T_z^R\). We assume that the vergence movement is completed in the time interval \([0,\dfrac{1}{2}]\). Next we consider version during the interval \([\dfrac{1}{2},1]\) and the gaze point moves between (3.28, 2.14, 3.28) and (3, 3, 3). It is evident from Fig. 16 that this time the two eyes move in the same direction (indicated by the same signs of \(T_z^L\) and \(T_z^R\)).
Example 13
(vergence/version) In this example, we consider the vergence/version decomposition of the eye movement chore from Example 10. The target points in Example 10 were at (7, 2, 4) and \((3,-2,8)\). The intermediate point F is computed as (7.6, 2.14, 4.34) using the procedure outlined in Example 12. The vergence and the version movements are sketched in Fig. 17, along with the external torques (see Table 3 as well). It is evident from Fig. 17e that in this example, the vergence movement is small compared to version. The magnitude of the external torque vector for vergence is considerably smaller (see Fig. 17d) than that for version. The actual numbers are in the last row of Table 3. The optimal control is computed by splitting the total time range [0, 1] into two equal intervals \(\left[ 0,\dfrac{1}{2}\right]\) and \(\left[ \dfrac{1}{2},1\right]\). We would like to note that perhaps a different choice such as \([0,\mu ]\), \([\mu ,1]\) for a small value of \(\mu\) would reduce the total energy requirement (see Footnote 12).
Remark 2
When an optimal gaze movement is split up between vergence and version, the expended energy rises rapidly. This is evident from Examples 12 and 13, wherein we observe that by and large, the version movements are energy inefficient.
6.3 The structure of the optimal controller
Recall that for eyes centered on the y-axis and for rotation matrices in \(\mathbf{LBIN}\) (as introduced in Sect. 3), the optimal external torque function obtained in (29) has an interesting linear structure. For each of the Examples 1–11, the optimal external torque function (see Figs. 5d, 6d, 7d, 8d, 9d, 10d, 11d, 12d, 13d, 14d, and 15d), is of the form
where \(t \in [0,1]\). The acronym ‘BVP’ stands for Boundary Value Problem, and \(T_{\mathrm{BVP}}(t)\) is the optimal external torque controller obtained by solving the ‘BVP’ outlined in (30) (see Footnote 13).
Note that in (29) we had
where
is a \(3 \times 6\) matrix and (33) indicates that \(T_{\mathrm{BVP}}(t)\) is in the row span of W(t), for \(t \in [0,1]\). If \(T_{\mathrm{BVP}}(t)\) is indeed of the form (32), it would follow that there exists a single vector \(T_{\mathrm{BVP}}(0)\) that is contained in the row span of W(t) for every \(t \in [0,1]\), i.e.,
We now state and prove the following proposition.
Proposition 1
Assume that \(T_{\mathrm{BVP}}(t)\) satisfies (32). If \(t_1, t_2 \in [0,1]\) are such that \(t_1 \ne t_2\) and we define a \(6 \times 6\) matrix
it follows that \({\mathrm{rank}} ~ L ~ < ~ 6\).
Proof of Proposition 1
From (32) and (33), or equivalently from (34), it would follow that there exists a nonzero vector \(T_{\mathrm{BVP}}(0)\) contained in the row span of \(W(t_1)\) and in the row span of \(W(t_2)\). It follows that there exist nonzero vectors \((a_1, b_1, c_1)\) and \((a_2, b_2, c_2)\) such that
Hence
\(\square\)
Remark 3
For each of Examples 1–10, we have computed in Table 4 the singular values of the matrix L (see (35)) choosing \(t_1=0\) and \(t_2= 0.25, 0.50, 0.75, 1.00\). Each row in the table lists out the 6 singular values. There are 4 pairs of \((t_1, t_2)\) for each example contributing to 4 rows. Ideally if (32) is perfectly satisfied, then the smallest singular value should be zero. This would imply that the last column of Table 4 should ideally be zero.
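The singular value test of Remark 3 can be sketched on synthetic data. In the following minimal sketch, the matrices `W1` and `W2` are random stand-ins for \(W(t_1)\) and \(W(t_2)\), constructed so that their row spans share a common vector `v` playing the role of \(T_{\mathrm{BVP}}(0)\). The matrix L is assumed here to be the vertical stacking of the two \(3 \times 6\) matrices; none of the numbers correspond to the examples in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)
v = rng.standard_normal(6)   # stand-in for the common vector T_BVP(0)

def rows_containing(v, rng):
    """Return a 3 x 6 matrix whose row span contains v (stand-in for W(t))."""
    basis = np.vstack([v, rng.standard_normal((2, 6))])  # v plus two generic rows
    mix = rng.standard_normal((3, 3))                    # generic row mixing
    return mix @ basis

W1 = rows_containing(v, rng)   # plays the role of W(t1)
W2 = rows_containing(v, rng)   # plays the role of W(t2)

# ASSUMPTION: L is the 6 x 6 matrix obtained by stacking W(t1) on top of W(t2).
L = np.vstack([W1, W2])

# Singular values in descending order; the last one vanishes up to round-off
# because the two row spans intersect in the line spanned by v.
s = np.linalg.svd(L, compute_uv=False)
print(s)
```

For data computed from a boundary value solver, the smallest singular value would not be exactly zero, and its size relative to the largest singular value gives a measure of how well (32) is satisfied.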
Remark 4
In Table 2, the vector \(T_{\mathrm{BVP}}(0)\) is listed for each of the ten examples. The equation (32) indicates that these vectors completely determine the optimal external torque function. To verify the control input \(T_{\mathrm{BVP}}(t)\) to the dynamical system (18), (19), we solve the corresponding initial value problem. In Fig. 18, we display the generalized coordinates from the initial value problem and the boundary value problem from Example 10 (see Fig. 14a), showing that the coordinates match perfectly. The matching has been performed for all the other examples as well, but the results are not displayed.
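The initial value check described in Remark 4 can be mimicked on a toy system. The sketch below does not use the dynamics (18), (19). It assumes instead a unit-mass double integrator \(\ddot{x} = u\) moved from rest at 0 to rest at d in unit time, for which the minimum-energy control is \(u(t) = 6d(1-2t)\), linear in time and fixed by its initial value \(u(0) = 6d\). The displacement `d` and all function names are hypothetical.

```python
import numpy as np

def rk4(f, y0, t0, t1, n):
    """Classical fixed-step fourth-order Runge-Kutta integration of y' = f(t, y)."""
    h = (t1 - t0) / n
    t, y = t0, np.asarray(y0, dtype=float)
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h / 2 * k1)
        k3 = f(t + h / 2, y + h / 2 * k2)
        k4 = f(t + h, y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return y

d = 2.0          # hypothetical target displacement
u0 = 6.0 * d     # initial value of the toy optimal control

def u(t):
    """Toy control, linear in time and determined by u(0) = u0."""
    return u0 * (1.0 - 2.0 * t)

def f(t, y):
    """Right-hand side for the state y = (position, velocity)."""
    return np.array([y[1], u(t)])

# Solve the initial value problem from rest at the origin over [0, 1].
x1, v1 = rk4(f, [0.0, 0.0], 0.0, 1.0, 100)
print(x1, v1)
```

The terminal state returned by the initial value problem agrees with the boundary conditions of the toy problem, which is the same kind of consistency check as the one displayed in Fig. 18.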
Consider an optimal control problem where the goal is to move from a point C to an arbitrary point \(D_i\) (see Fig. 19). If we assume that the initial point C is fixed and the terminal point \(D_i\) is arbitrary, we consider the following map:
where
In (36) the notation \(\mathbf {Gr}(3,6)\) refers to the Grassmannian manifold [40] of homogeneous 3-planes in \(\mathbb {R}^6\).
The following proposition pertains to the optimal external torque function that transfers the gaze optimally from C to \(D_i\). Let us denote this function by \(T_{CD_i}(t)\) Footnote 14.
Proposition 2
Let us assume that the optimal external torque function \(T_{CD_i}(t)\) satisfies (32) for every \(D_i\) in \(\mathbb {R}^3\). Then it follows that
for every \(t \in [0,1]\).
Proof of Proposition 2
It would follow from (33) that Footnote 15
This would imply that \(T_{CD_i}(0) \in \chi (C)\). Using the fact that the external torque function \(T_{CD_i}(t)\) satisfies (32), it would follow that \(T_{CD_i}(t) \in \chi (C)\). \(\square\)
To end this section we would like to make the following remark.
Remark 5
The image of the map \(\chi\) in (36) can be described as follows
7 Conclusions
Our main goal in this paper has been to pose and solve, using the proposed Riemannian formulation, a binocular optimal control problem wherein a pair of visual sensors rotate to shift attention from one target to another. The associated dynamical system (26) is written out using the Euler-Lagrange equations as a control system with ‘External Torque’ as input. The goal is to minimize a suitable quadratic control energy function while the binocular system changes focus from one target to another. Various eye movement chores were simulated in Sect. 5 and their optimal gaze trajectories were computed. In Examples 12 and 13, the optimal gaze trajectory is split up into an optimal vergence and an optimal version movement. We observe that the ‘vergence/version’ movements are not energy efficient and note this fact in Table 3. In this table, the energy requirement for two consecutive transfers, vergence followed by version, is significantly higher than the optimal energy requirement for a direct transfer. We conclude this paper by emphasizing two particularly interesting observations that we were able to demonstrate in our simulations in Sect. 5. The first one is that ‘Eye movement chores side to side, i.e., from right to left and back, are energy expensive,’ as opposed to eye movements along the xz-plane (see Table 1, where we observe that in Examples 1 and 4 the energy per unit arc length is higher than the corresponding numbers for Examples 2, 3, 5 and 6). The second one is that ‘The optimal external torque function is linear in time’ (see Sect. 6.3). Somewhat surprisingly, we observe that the ‘initial value of the external torque vector completely determines the entire function in the interval [0, 1]’ and the choice of this vector is determined by the initial and the final position of the gaze point in \(\mathbb {R}^3\).
Moreover the initial value of the external torque vector lies in a fixed 3-plane in \(\mathbb {R}^6\) determined by the initial value of the gaze. Changing the initial gaze point alters the 3-plane in \(\mathbb {R}^6\) as a point in the Grassmannian manifold \(\mathbf {Gr}(3,6)\) Footnote 16 (see [40]).
We do not have a specific explanation for the two simulation-based observations described here. However, they do have consequences in terms of how visual exploration could be carried out for the binocular system of eyes. If there is a set of n visual targets in space, for some positive integer n, one would like to design a search scheme to sequentially explore the targets. A smart strategy would be to avoid, as much as possible, right-to-left and left-to-right movements, because these are energy inefficient. Finally, the linearity of the optimal external torque function has computational consequences. To compute these functions, it is enough to compute \(T_{\mathrm{BVP}}(0)\), the optimal torque vector at the initial time \(t=0\). In principle one can compute these vectors over a discrete set of target points and use interpolation for other intermediate values. This simplifies the optimal controller computation as opposed to solving the boundary value problem for every set of target points.
As a final remark we note that optimal binocular eye movement is an important problem to study in order to design machines that can visually explore a terrain optimally. A limitation of the proposed research is that the optimal controllers we design are typically implemented in open loop, and their computation requires solving a two-point boundary value problem using COMSOL, which may not be readily available. Such controllers are perhaps not robust, and keeping the final target point stable could become an issue. As a possible subject for future work, we propose to compare optimal controllers synthesized in this paper with other stabilizing, but possibly non-optimal, controllers.
Notes
We define \(\langle \cdot \rangle\) to be the standard Euclidean inner product on the product space \({\mathbb {R}}^4 \times {\mathbb {R}}^4\).
The details of this computation are omitted, see [8].
G is the corresponding Riemannian matrix of inner products, see [8].
We assume that the potential energy function \({\varvec{V}}\) is zero.
For a detailed computation of the M matrix and proof of the statement (20), we refer to the appendix.
The superscript L and R are for left eye and right eye, respectively. The subscripts refer to the x-, y- and z-axes.
Hering’s law states that a movement of one eye is accompanied by an approximately equal movement of the other eye, either in the same direction (version) or in the opposite direction (vergence), see [34].
We now indicate how the point \(F=(3.28,2.14,3.28)\) is computed. Let E be the midpoint between the centers of the two eyes and let us denote by C and D the points (1, 1, 1) and (3, 3, 3) respectively (see Fig. 2). We choose F in such a way that F, C and E are in a straight line. Moreover, we assume that the distance between F and E equals the distance between D and E. Among the two possible choices of F, we choose the one closest to D.
\(T_{CD_i}(t)\) is the external torque function computed by solving the Boundary Value Problem on the system (30).
Note that although the \(\lambda =[\lambda _2(t) ~ \lambda _4(t) ~ \lambda _6(t)]\) vector depends on both C and \(D_i\), the matrix W(0) depends only on C.
\(\mathbf {Gr}(3,6)\) is the Grassmannian manifold of 3-planes in \(\mathbb {R}^6\).
This proposition is already known to be true on \(\mathbf{SO}(3)\) and \(\mathbf{LIST}\), as reported in [21].
References
Rajamuni, M. M., Aulisa, E., & Ghosh, Bijoy K. (2014). Optimal control problems in binocular vision. IFAC Proceedings Volumes, 47(3), 5283–5289.
Wijayasinghe, Indika B., & Ghosh, Bijoy K. (2013). Binocular eye tracking control satisfying Hering's law. In Proceedings of the 52nd IEEE Conference on Decision and Control (pp. 6475–6480). Florence, Italy.
Oki, Takafumi, & Ghosh, Bijoy K. (2015). Stabilization and trajectory tracking of version and vergence eye movements in human binocular control. In Proceedings of the European Control Conference (ECC) (pp. 1573–1580). Linz, Austria.
Ruths, Justin, Ghosh, Supratim, & Ghosh, Bijoy K. (2016). Optimal tracking of version and vergence eye movements in human binocular control. In Proceedings of the European Control Conference (ECC) (pp. 2410–2415). Aalborg, Denmark.
Angelaki, D. E., & Hess, B. J. M. (2004). Control of eye orientation: where does the brain's role end and the muscle's begin? European Journal of Neuroscience, 19(1), 1–10. https://doi.org/10.1111/j.1460-9568.2004.03068.x.
Robinson, D. A. (1981). The use of control systems analysis in the neurophysiology of eye movements. Annual Review of Neuroscience, 4, 463–503. https://doi.org/10.1146/annurev.ne.04.030181.002335.
Robinson, D. A. (1964). The mechanics of human saccadic eye movement. Journal of Physiology, 174(2), 245–264.
Polpitiya, Ashoka D., Dayawansa, Wijesuriya P., Martin, Clyde F., & Ghosh, Bijoy K. (2007). Geometry and control of human eye movements. IEEE Transactions on Automatic Control, 52(2), 170–180.
Raphan, T. (1998). Modeling control of eye orientation in three dimension I, role of muscle pulleys in determining saccadic trajectory. Journal of Neurophysiology, 79(5), 2653–2667.
Nielsen, J. B. (2003). How we walk: central control of muscle activity during human walking. Neuroscience, 9(3), 195–204.
Brockett, R. W. (1973). Lie theory and control systems defined on spheres. SIAM Journal on Applied Mathematics, 25(2), 213–225.
Abraham, R., & Marsden, J. E. (1987). Foundations of Mechanics (2nd ed.). Boston: Addison-Wesley.
Arnol’d, V. I. (1989). Mathematical Methods of Classical Mechanics (2nd ed., Vol. 60). New York: Springer Verlag.
Smale, Steve. (1970). Topology and mechanics, I. Inventiones Mathematicae, 10(4), 305–331.
Bullo, F., Murray, R. M., & Sarti, S. (1995). Control on the sphere and reduced attitude stabilization. IFAC Proceedings Volumes, 28(14), 495–501.
Martin, C., & Schovanec, L. (1998). Muscle mechanics and dynamics of ocular motion. Journal of Mathematical Systems, 8(2), 1–15.
Miller, J., & Robinson, D. (1984). A model of the mechanics of binocular alignment. Journal of Mathematical Systems Estimation and Control, 17(5), 436–470.
Quaia, C., & Optican, L. (1998). Commutative saccadic generator is sufficient to control a 3D ocular plant with pulleys. Journal of Neurophysiology, 79(6), 3197–3215.
Tweed, D., Haslwanter, T., & Fetter, M. (1998). Optimizing gaze control in three dimensions. Science, 281(5381), 1363–1366.
Ghosh, B. K., & Wijayasinghe, I. B. (2012). Dynamics of human head and eye rotations under Donders’ constraint. IEEE Transactions on Automatic Control, 57(10), 2478–2489.
Ghosh, B. K., Wijayasinghe, I. B., & Kahagalage, S. D. (2014). A geometric approach to head/eye control. IEEE Access, 2, 316–332. https://doi.org/10.1109/ACCESS.2014.2315523.
Wijayasinghe, I. B., Ruths, J., Büttner, U., Ghosh, B. K., Glasauer, S., Kremmyda, O., et al. (2014). Potential and optimal control of human head movement using Tait-Bryan parametrization. Automatica, 50(2), 519–529. https://doi.org/10.1016/j.automatica.2013.11.017.
Handzel, A. A., & Flash, T. (1995). The geometry of eye rotations and Listing’s law. In Proceedings of Conference on Advances in Neural Information Processing Systems (pp. 117–123). Denver, CO.
Haslwanter, T. (1995). Mathematics of three dimensional eye rotations. Vision Research, 35(12), 1727–1739.
Hepp, K. (1990). On Listing’s law. Communications in Mathematical Physics, 132(1), 285–292.
Opstal, J. V. (1988). Three Dimensional Kinematics Underlying Gaze Control. New York: Springer Verlag.
Boothby, W. M. (2003). An Introduction to Differentiable Manifolds and Riemannian Geometry. Houston: Gulf Professional Publishing.
do Carmo, M. P. (1993). Riemannian Geometry. Boston: Birkhäuser.
Listing, J. B. (1845). Beiträge zur physiologischen Optik. Göttinger Studien, Vandenhoeck und Ruprecht, Göttingen.
Hering, E. (1868). The Theory of Binocular Vision. New York: Plenum Press.
Nakayama, K., Ciuffreda, K., & Schor, C. (1983). Kinematics of normal and strabismic eyes. In Basic and Clinical Aspects of Binocular Vergence Movements (pp. 544–564). Butterworths.
van Rijn, L. J., & van den Berg, A. V. (1993). Binocular eye orientation during fixations: Listing’s law extended to include eye vergence. Vision Research, 33(5–6), 691–708.
Collewijn, H., Erkelens, C. J., & Steinman, R. M. (1989). Ocular vergence under natural conditions, II: Gaze shifts between real targets differing in distance and direction. Proceedings of the Royal Society Series B – Biological Sciences, 236(1285), 441–465.
Ono, H., & Nakamizo, S. (1978). Changing fixation in the transverse plane at eye level and Hering’s law of equal innervation. Vision Research, 18(5), 511–519.
Ono, H., Nakamizo, S., & Steinbach, M. J. (1978). Nonadditivity of vergence and saccadic eye movement. Vision Research, 18(6), 735–739.
Hess, B. J. M. (2018). On the role of ocular torsion in binocular visual matching. Scientific Reports, 8(1), 10666. https://doi.org/10.1038/s41598-018-28513-8.
Kuipers, J. (1998). Quaternions and Rotation Sequences. Princeton: Princeton University Press.
O’Reilly, O. M. (2007). The dual Euler basis: constraints, potentials, and Lagrange’s equations in rigid body dynamics. ASME Journal of Applied Mechanics, 74(2), 1–10.
Zimmerman, W. B. J. (2006). Multiphysics Modeling with Finite Element Methods. River Edge, NJ: World Scientific.
Milnor, John W., & Stasheff, James D. (1974). Characteristic Classes. Princeton: Princeton University Press and University of Tokyo Press.
Acknowledgements
Part of the work on this paper was possible while the first author was visiting the School of Automation Engg., UESTC at Chengdu, China as a Professor. This paper was also supported in part by Dick and Martha Professorship at Texas Tech University, Lubbock, U.S.A.
Appendix: Riemannian matrix calculations on LBIN
Using the axis-angle parameterization from Sect. 2, we now show how to construct the M matrix in (19). We begin by recalling from [20], page 2488, that for the Listing’s manifold, the angular and the generalized velocity vectors, with respect to a head-fixed inertial frame, are related by the following
For the Binocular system of eyes, where the two eyes separately satisfy the Listing’s constraint (see [29]), we can augment (37) and write
where the superscripts L and R refer to the left and the right eye, respectively. Writing the co-planarity condition (13) as
one can express
and hence, we can rewrite the transformation matrix in (38) as a \(6 \times 3\) matrix M, for \(\mathbf{LBIN}\) as follows:
The subscript LB stands for Listing and Binocular. The operator \(\nabla\) is defined as
The matrix \(M_{LB}^T\) is precisely the transformation from the vector T of ‘external torque to the two eyes’ to the generalized torque vector \(\tau\) as outlined in (19) (see also [38]). We are now ready to prove the following proposition:
Proposition 3
Let \(G_{LB}\) be the Riemannian matrix on \(\mathbf{LBIN}\) and let \(M_{LB}\) be the matrix defined in (40). Then equation (20) is satisfied Footnote 17.
Proof of Proposition 3
Explicitly writing the matrix G from (16), we obtain
Writing \(M_{LB}^TM_{LB}\) from (40), we obtain (20). \(\square\)
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ghosh, B.K., Athukorallage, B. Minimum energy optimal external torque control of human binocular vision. Control Theory Technol. 18, 431–458 (2020). https://doi.org/10.1007/s11768-020-00015-x
DOI: https://doi.org/10.1007/s11768-020-00015-x