1 Introduction

The class of problems that we consider in this paper concerns how the human eyes switch their point of focus between two point targets in visual space. We assume that the eyes are on a stationary head and rotate to inspect point targets located in 3D. The gaze directions of the two eyes are always constrained to pass through a common point, and the goal of the sensing mechanism is to start with one target fixed in its view and to switch to an alternate target in the visual field in a fixed interval of time, assumed to be [0, 1]. The paper analyses optimal rotation for a pair of human eyes as an optimal control problem over a fixed interval of time, extending two of our prior works [1, 2]. In our earlier research, binocular eye movement has been studied as a cascade of version and vergence eye rotations applied to the two eyes separately (see also [3, 4] and Sect. 6.2 of this paper).

Modeling the dynamics of monocular eye rotation has been an important goal in Neuroscience (see [5] for a short review article on how the brain controls eye movement) and Biomechanics [6]. Since the early half of the 19th century, scientists have tried to create dynamic models in order to understand various eye movement trajectories (see [7] for some historical details). The eyes rotate with three degrees of freedom [8, 9], rendering the eye movement system a relatively simple mechanical control system compared to other complex human movement systems [10]. Starting from some of the initial papers of [11], the principles from geometry, for example as in [12,13,14], are central to many of the key questions in nonlinear systems theory (see [15]), applied to rotational dynamics. Specific to the eye movement control system, we would also like to refer to [16,17,18] and the many references therein. For a single eye, optimization problems associated with gaze control [19] have been extended to optimal control problems studied by [20, 21] and [22].

Geometric methods are well known in the study of eye movement rotation (see [23,24,25] and [26]). Riemannian geometry (see [27, 28]) has been introduced for monocular control problems in [8]. The main contribution of this paper is to extend the Riemannian geometric formulation to binocular control problems. A configuration space for the binocular eye pair is described as a subset of \({\mathbf{SO}(3)} \times {\mathbf{SO}(3)}\). As described in our earlier paper [21], to every gaze direction of the eye there corresponds a circle of rotation matrices. This ambiguity is resolved by imposing Listing’s Law on each of the two eyes (Footnote 1). Additionally we assume that the gaze directions of the two eyes always meet at a point in the visual space \({\mathbb {R}}^3\), i.e., the eyes are always focused (see Fig. 1).

Fig. 1 Human binocular system with two eyes fixated on a target point

To end this section, we note that the study of binocular eye movement was perhaps introduced by [30], not much later than the original eye movement studies of Listing in 1845 and Donders in 1848. In spite of that, the literature on how the binocular eye pair is actuated is rather sparse (see [31] and [32]). Human eyes, according to [30], are actuated ‘versionally’ towards a target followed by a ‘vergence’ focusing mechanism (see Fig. 2). In an actual eye movement maneuver, the order of vergence and version can repeat multiple times. Donders’ constraint during these specific eye movements has also been studied in [32], and we refer our readers to [4] (see also [33, 34] and [35] for other aspects of binocular eye movement studies). In a recent study, see [36], the role of torsional rotation of the human eye in matching the two retinal images has been introduced. The Riemannian geometric approach to optimal binocular eye movement, studied in this paper, is new.

Fig. 2 The ‘vergence’ and ‘version’ eye movements. The point E is the mid-point between the two eyes. Initial and final gaze points are C and D, respectively. C is an arbitrary point in \(\mathbb {R}^3\). D is chosen in such a way that the center of the left eye, C and D are collinear. Point F is the ‘end-point of the vergence’ and ‘initial point of the version’ eye movements. The distance between F and E is chosen to equal the distance between D and E. Among the two possible choices of F, the one closest to D is chosen

Fig. 3 Axis-angle parametrization under Listing’s constraint \(q_3=0\). The Listing’s plane is described by \(z=0\). Angle \(\theta\) is the angle subtended between the positive axis of rotation on the Listing’s plane and the positive x-axis. Angle \(\phi\) is the anticlockwise rotation about the positive axis of rotation

2 Notations and terminology

We start this section by introducing the axis-angle parametrization of quaternions (see Fig. 3), where the notation is borrowed from [20, 21] and [8]. Let us begin with the space of quaternions, denoted by \({\varvec{Q}}\) (see [37]), and write each \(q \in {\varvec{Q}}\) as \(q_0{{\varvec{1}}} + q_1{{\varvec{i}}}+q_2{{\varvec{j}}}+q_3{{\varvec{k}}}\).

The space of unit quaternions will be identified with the unit sphere \(\varvec{S^3}\), and each unit quaternion can be written as

$$\begin{aligned} q= \displaystyle {\cos \Big (\dfrac{\phi }{2}\Big ){{\varvec{1}}} + \sin \Big (\dfrac{\phi }{2}\Big )n_1{{\varvec{i}}} + \sin \Big (\dfrac{\phi }{2}\Big )n_2{{\varvec{j}}} + \sin \Big (\dfrac{\phi }{2}\Big )n_3{{\varvec{k}}}}, \end{aligned}$$
(1)

where \(\phi \in [0,2\pi ]\) is an angle variable and \(n=(n_1,n_2,n_3)\) is a unit axis vector in \({\mathbb {R}}^3\). We denote by \(\mathbf{rot}\) the standard map from \(\boldsymbol {S^3}\) into \(\mathbf{ SO (3)}\), which maps the quaternion q to an orthogonal matrix that rotates a vector in \({\mathbb {R}}^3\) around the axis of rotation n by a counterclockwise angle \(\phi\) (see Fig. 3). It is easy to verify (see (3) in [21]) that

$$\begin{aligned} \mathbf{rot} (q)= \left[ \begin{array}{ccc} q_0^2+q_1^2-q_2^2 - q_3^2 &{} 2(q_1q_2-q_0q_3) &{} 2(q_1q_3+q_0q_2)\\ 2(q_1q_2+q_0q_3) &{} q_0^2+q_2^2-q_1^2 - q_3^2 &{} 2(q_2q_3-q_0q_1)\\ 2(q_1q_3-q_0q_2) &{} 2(q_2q_3+q_0q_1) &{} q_0^2+q_3^2-q_1^2 - q_2^2 \end{array}\right] . \end{aligned}$$
(2)

The orthogonal matrix (2) can be associated with the orientation of a rotating rigid body as follows: the columns of (2) are mutually orthogonal unit vectors, and we associate the three column vectors with three body coordinates that describe the orientation. The rotating rigid body, viz. the human eye, has a specific ‘gaze direction,’ a vector whose direction is what we propose to control. We use the convention that the gaze direction is given by the third column of the rotation matrix (2). We therefore have the following projection map, projecting the orientation matrix (2) to a gaze direction vector

$$\begin{aligned} \mathbf{proj}: \mathbf{ SO (3)} \rightarrow \boldsymbol{S^2}, \end{aligned}$$

where

$$\begin{aligned} \mathbf{rot (\boldsymbol{q})}\longmapsto \left[ \begin{array}{c} 2(q_1q_3+q_0q_2)\\ 2(q_2q_3-q_0q_1)\\ q_0^2+q_3^2-q_1^2-q_2^2 \end{array} \right] . \end{aligned}$$
(3)

Typically our interest is to control the gaze vector (3) so that it points towards a suitable point target (see Fig. 4). As commented in Sect. 1, additional constraints need to be imposed on the quaternion q so that the constrained orientation matrix with a specific gaze direction is unique. We consider the pair of eyes to individually satisfy Listing’s constraint, given by \(q_3=0\). In the literature on binocular rotation, a general Donders’ constraint is also of interest (see [36], where torsional movements allow fusion of the images from the two eyes); it will be considered in later papers.
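As a minimal sketch of the maps in (1), (2) and (3), the following Python fragment builds a unit quaternion from an axis and an angle, forms the rotation matrix, and reads off the gaze direction as its third column. The function names are our own illustrative choices and are not taken from [21]; only the standard `math` module is assumed.

```python
import math

def axis_angle_quaternion(phi, n):
    """Unit quaternion (q0, q1, q2, q3) of Eq. (1) for angle phi about the unit axis n."""
    s = math.sin(phi / 2.0)
    return (math.cos(phi / 2.0), s * n[0], s * n[1], s * n[2])

def rot(q):
    """Rotation matrix of Eq. (2), returned as a list of three rows."""
    q0, q1, q2, q3 = q
    return [
        [q0*q0 + q1*q1 - q2*q2 - q3*q3, 2*(q1*q2 - q0*q3), 2*(q1*q3 + q0*q2)],
        [2*(q1*q2 + q0*q3), q0*q0 + q2*q2 - q1*q1 - q3*q3, 2*(q2*q3 - q0*q1)],
        [2*(q1*q3 - q0*q2), 2*(q2*q3 + q0*q1), q0*q0 + q3*q3 - q1*q1 - q2*q2],
    ]

def proj(R):
    """Gaze direction of Eq. (3): the third column of the rotation matrix."""
    return [R[0][2], R[1][2], R[2][2]]

# Illustrative check: a 90 degree rotation about the x-axis carries the
# primary gaze (0, 0, 1) to (0, -1, 0).
q = axis_angle_quaternion(math.pi / 2.0, (1.0, 0.0, 0.0))
R = rot(q)
gaze = proj(R)
```

The example rotation is chosen only because its result is easy to confirm by hand; it does not satisfy any particular eye movement constraint beyond having \(q_3=0\).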

Fig. 4 Configuration space of the binocular system. The centers of the left and the right eyes are located respectively at (0, 0, 0) and (0, 1, 0). Note that the center of the axes is assumed to coincide with the left eye, the upward-pointing axis is the positive x-axis while the forward-pointing axis is the positive z-axis. The y-axis joins the centers of the two eyes and the direction from left to right is chosen positive. Curly arrows (in blue) indicate the positive (clockwise, viewed from the axis center) rotation about each axis. The blue dot is the target, and the two red arrows are the eye-gaze directions

3 Riemannian metric on human binocular system

We begin by considering parametrization of a point in \(\boldsymbol {S^3}\), as introduced in (1) and further describe the unit vector n, the axis of rotation, as

$$\begin{aligned} n=[\cos \theta \cos \alpha ~~ \sin \theta \cos \alpha ~~ \sin \alpha ], \end{aligned}$$
(4)

where \(\theta \in [0,\pi ]\) and \(\alpha \in [-\dfrac{\pi }{2},\dfrac{\pi }{2}].\) The parameterizations (1) and (4) together describe what is known in [21] as the axis-angle parameterization of \(\boldsymbol {S^3}\) and, using the mapping ‘\(\mathbf{rot}\)’, of \(\mathbf{ SO (3)}\). In order for the orientation of a single eye to satisfy Listing’s constraint, \(q_3=0\), we impose \(\alpha =0\), forcing the axis of rotation to always lie on the plane \(z=0\). This reduces the quaternion in (1) to the form (see [8])

$$\begin{aligned} q=\cos \dfrac{\phi }{2} {\varvec{1}}+\sin \dfrac{\phi }{2} \left[ \cos \theta \,{\varvec{i}}+\sin \theta \, {\varvec{j}} \right] + 0 \,{\varvec{k}}. \end{aligned}$$
(5)

We now introduce two such quaternions, one for the left eye \(q^L\) and the other for the right eye \(q^R\), described as

$$\begin{aligned} q^L=\cos \dfrac{\phi ^L}{2} {{\varvec{1}}}+\sin \dfrac{\phi ^L}{2} \left[ \cos \theta ^L {{\varvec{i}}} +\sin \theta ^L {{\varvec{j}}} \right] + 0 \,{\varvec{k}} \end{aligned}$$
(6)

and

$$\begin{aligned} q^R=\cos \dfrac{\phi ^R}{2} {{\varvec{1}}}+\sin \dfrac{\phi ^R}{2} \left[ \cos \theta ^R {{\varvec{i}}}+\sin \theta ^R {{\varvec{j}}} \right] + 0 \,{\varvec{k}}. \end{aligned}$$
(7)

The pair \((q^L, q^R)\) is thus an element of \(\boldsymbol {S^3} \times \boldsymbol {S^3}\). Note that the gaze directions corresponding to the left and the right eye are given by

$$\begin{aligned} g_L&=(\sin \theta ^L \sin \phi ^L,\; -\cos \theta ^L \sin \phi ^L,\; \cos \phi ^L), \end{aligned}$$
(8)
$$\begin{aligned} g_R&=(\sin \theta ^R \sin \phi ^R,\; -\cos \theta ^R \sin \phi ^R,\; \cos \phi ^R). \end{aligned}$$
(9)
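The gaze directions (8) and (9) follow by substituting the Listing quaternion (5) into the third column (3), reading each of (8) and (9) as a vector with three components. A short numerical check, written as a sketch with illustrative function names of our own, is:

```python
import math

def listing_quaternion(theta, phi):
    """Quaternion of Eq. (5): rotation axis in the plane z = 0, hence q3 = 0."""
    s = math.sin(phi / 2.0)
    return (math.cos(phi / 2.0), s * math.cos(theta), s * math.sin(theta), 0.0)

def gaze_from_quaternion(q):
    """Third column of rot(q), i.e. the projection in Eq. (3)."""
    q0, q1, q2, q3 = q
    return (2*(q1*q3 + q0*q2), 2*(q2*q3 - q0*q1), q0*q0 + q3*q3 - q1*q1 - q2*q2)

def gaze_closed_form(theta, phi):
    """Closed form of Eqs. (8)-(9) for a single eye."""
    return (math.sin(theta) * math.sin(phi),
            -math.cos(theta) * math.sin(phi),
            math.cos(phi))

theta, phi = 2.0, 0.7  # arbitrary illustrative angles in radians
g_a = gaze_from_quaternion(listing_quaternion(theta, phi))
g_b = gaze_closed_form(theta, phi)
max_diff = max(abs(a - b) for a, b in zip(g_a, g_b))
```

Here `max_diff` should vanish up to rounding, since \(2q_0q_2=\sin \theta \sin \phi\), \(-2q_0q_1=-\cos \theta \sin \phi\) and \(q_0^2-q_1^2-q_2^2=\cos \phi\).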

Let us now assume that the centers of the left and the right eyes are separated by the vector \(e=(x,y,z)=(0,1,0)\) as shown in Fig. 4, so that the centers of the left and the right eyes are located respectively at (0, 0, 0) and (0, 1, 0). The configuration space of the binocular system is now described by imposing that the vectors \(g_L, g_R\) and e are coplanar. Such a coplanarity condition ensures that the gaze directions of the left and the right eyes always meet at a point. Thus we have

$$\begin{aligned} {} \cos \phi ^R \sin \theta ^L \sin \phi ^L= \cos \phi ^L \sin \theta ^R \sin \phi ^R. \end{aligned}$$
(10)

We denote by \(\mathbf{LBIN}\) (L stands for Listing, BIN stands for Binocular), the subset of \(\mathbf{ SO (3)} \times \mathbf{ SO (3)}\) where the orientation matrices separately obey Listing’s Law and together the corresponding gaze directions satisfy the co-planarity condition (10). Equivalently, \(\mathbf{LBIN}\) is a subset of \(\mathbf{LIST} \times \mathbf{LIST}\) (see [8]) where the gaze directions of each component satisfy (10). Let \(\rho\) be the mapping

$$\begin{aligned} {} \rho :[0,\pi ] \times [0,2\pi ] \times [0,\pi ] \rightarrow \boldsymbol {S^3} \times \boldsymbol {S^3} \end{aligned}$$
(11)

described as

$$\begin{aligned} {} \rho (\theta ^L,\phi ^L,\theta ^R) = \left[ \left[ \begin{array}{c} \cos {\dfrac{\phi ^L}{2}}\\ \cos {\theta ^L}\sin {\dfrac{\phi ^L}{2}}\\ \sin {\theta ^L}\sin {\dfrac{\phi ^L}{2}}\\ 0 \end{array}\right] , \left[ \begin{array}{c} \cos {\dfrac{\phi ^R}{2}}\\ \cos {\theta ^R}\sin {\dfrac{\phi ^R}{2}}\\ \sin {\theta ^R}\sin {\dfrac{\phi ^R}{2}}\\ 0 \end{array}\right] \right] , \end{aligned}$$
(12)

where from (10) we have

$$\begin{aligned} {} \phi ^R=\tan ^{-1} \Big \{ \dfrac{\sin \theta ^L \sin \phi ^L}{\sin \theta ^R \cos \phi ^L} \Big \}. \end{aligned}$$
(13)

Let us denote by \({\varvec{P}}\) the ‘yz-plane’ (also called the Transverse Plane). We remark that the right hand side of (13) is well defined if \(g_L \not \in {\varvec{P}}\) and \(g_R \not \in {\varvec{P}}\), i.e., if the eyes are looking above or below, in the quadrant of the positive or negative x-axis respectively. On the other hand, assuming that \(g_L \ne (0,0,1)\) and \(g_R \ne (0,0,1)\), the right hand side of (13) is not defined if \(g_L \in {\varvec{P}}\) and \(g_R \in {\varvec{P}}\). In Sect. 5 of this paper, we assume that the eyes are always looking in the quadrant corresponding to the positive x-axis.
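To illustrate how (13) closes the parameterization, the sketch below evaluates \(\phi ^R\) from \((\theta ^L,\phi ^L,\theta ^R)\), checks the residual of the coplanarity condition (10), and recovers the point where the two gaze rays meet. The function names and the sample angles are our own assumptions; the code presumes the eyes look into the quadrant of the positive x-axis, so that the denominators below do not vanish.

```python
import math

def gaze(theta, phi):
    """Gaze direction of Eqs. (8)-(9)."""
    return (math.sin(theta) * math.sin(phi),
            -math.cos(theta) * math.sin(phi),
            math.cos(phi))

def phi_right(theta_l, phi_l, theta_r):
    """Right-eye angle from Eq. (13)."""
    return math.atan(math.sin(theta_l) * math.sin(phi_l)
                     / (math.sin(theta_r) * math.cos(phi_l)))

def coplanarity_residual(theta_l, phi_l, theta_r, phi_r):
    """Left side minus right side of Eq. (10); zero on the configuration space."""
    return (math.cos(phi_r) * math.sin(theta_l) * math.sin(phi_l)
            - math.cos(phi_l) * math.sin(theta_r) * math.sin(phi_r))

def fixation_point(theta_l, phi_l, theta_r):
    """Intersection of the ray s*g_L from (0,0,0) with the ray (0,1,0) + t*g_R."""
    phi_r = phi_right(theta_l, phi_l, theta_r)
    gl, gr = gaze(theta_l, phi_l), gaze(theta_r, phi_r)
    # x-components give t = s*gl_x/gr_x; the y-components then fix s.
    s = 1.0 / (gl[1] - gl[0] * gr[1] / gr[0])
    return tuple(s * c for c in gl)

# Illustrative configuration: both eyes fixating the point (1, 1, 1).
theta_l, phi_l, theta_r = 3.0 * math.pi / 4.0, math.acos(1.0 / math.sqrt(3.0)), math.pi / 2.0
phi_r = phi_right(theta_l, phi_l, theta_r)
residual = coplanarity_residual(theta_l, phi_l, theta_r, phi_r)
point = fixation_point(theta_l, phi_l, theta_r)
```

For this configuration the right eye at (0, 1, 0) looks along (1, 0, 1), so \(\phi ^R=\pi /4\) and the rays meet at (1, 1, 1).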

A Riemannian metric on \(\mathbf{LBIN}\) is easily induced from \(\mathbf{ SO (3)} \times \mathbf{ SO (3)}\). We define the elements \(g_{ij}\) of the symmetric Riemannian matrix \(G_{LB}\) as (Footnote 2)

$$\begin{aligned} \left\{ \begin{array}{c} g_{11}=\langle \dfrac{\partial }{\partial \theta ^L},\dfrac{\partial }{\partial \theta ^L}\rangle ,~~ g_{12}=\langle \dfrac{\partial }{\partial \theta ^L},\dfrac{\partial }{\partial \phi ^L}\rangle ,\\ g_{13}=\langle \dfrac{\partial }{\partial \theta ^L},\dfrac{\partial }{\partial \theta ^R}\rangle ,~~ g_{22}=\langle \dfrac{\partial }{\partial \phi ^L},\dfrac{\partial }{\partial \phi ^L}\rangle ,\\ g_{23}=\langle \dfrac{\partial }{\partial \phi ^L},\dfrac{\partial }{\partial \theta ^R}\rangle ,~~ g_{33}=\langle \dfrac{\partial }{\partial \theta ^R},\dfrac{\partial }{\partial \theta ^R}\rangle \end{array}\right. \end{aligned}$$
(14)

and compute (Footnote 3) the Riemannian metric g given by

$$\begin{aligned} g = [{{\dot{\theta }}^L}~~ {{\dot{\phi }}^L}~~{\dot{\theta }^R}] G_{LB} \begin{bmatrix} {\dot{\theta }^L}\\ {\dot{\phi }^L}\\ {\dot{\theta }^R} \end{bmatrix} \end{aligned}$$
(15)

as (Footnote 4)

$$\begin{aligned} {} g =&\Big \{\sin ^2\Big (\dfrac{\phi ^L}{2}\Big )+\dfrac{1}{4}\Big (\dfrac{\partial \phi ^R}{\partial \theta ^L}\Big )^2\Big \}(\dot{\theta }^L)^2 + \dfrac{1}{4}\Big \{1+\Big (\dfrac{\partial \phi ^R}{\partial \phi ^L}\Big )^2\Big \}(\dot{\phi }^L)^2 \nonumber \\&+ \Big \{\sin ^2\Big (\dfrac{\phi ^R}{2}\Big )+\dfrac{1}{4}\Big (\dfrac{\partial \phi ^R}{\partial \theta ^R}\Big )^2\Big \}(\dot{\theta }^R)^2 + \dfrac{1}{2}\Big (\dfrac{\partial \phi ^R}{\partial \theta ^L} \Big )\Big (\dfrac{\partial \phi ^R}{\partial \theta ^R}\Big )\dot{\theta }^R\dot{\theta }^L \nonumber \\&+ \dfrac{1}{2}\Big (\dfrac{\partial \phi ^R}{\partial \phi ^L} \Big )\Big (\dfrac{\partial \phi ^R}{\partial \theta ^R}\Big )\dot{\phi }^L\dot{\theta }^R + \dfrac{1}{2}\Big (\dfrac{\partial \phi ^R}{\partial \phi ^L} \Big )\Big (\dfrac{\partial \phi ^R}{\partial \theta ^L}\Big ){\dot{\phi }^L}{\dot{\theta }^L}. \end{aligned}$$
(16)
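The coefficients in (16) can be cross-checked numerically. The sketch below assumes, as one reading of the construction, that \(G_{LB}\) is the pullback of the Euclidean inner product on \({\mathbb {R}}^4 \times {\mathbb {R}}^4\) under the map \(\rho\) of (12), so that \(G_{LB}=J^TJ\) with J the Jacobian of \(\rho\). The partial derivatives of \(\phi ^R\) are obtained by differentiating (13); the function names and the test point are illustrative.

```python
import math

def phi_right(tl, pl, tr):
    """Eq. (13)."""
    return math.atan(math.sin(tl) * math.sin(pl) / (math.sin(tr) * math.cos(pl)))

def rho(x):
    """Map of Eq. (12): (theta_L, phi_L, theta_R) -> pair of quaternions, flattened."""
    tl, pl, tr = x
    pr = phi_right(tl, pl, tr)
    return [math.cos(pl/2), math.cos(tl)*math.sin(pl/2), math.sin(tl)*math.sin(pl/2), 0.0,
            math.cos(pr/2), math.cos(tr)*math.sin(pr/2), math.sin(tr)*math.sin(pr/2), 0.0]

def metric_pullback(x, h=1e-6):
    """G = J^T J with J estimated by central finite differences."""
    cols = []
    for k in range(3):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        cols.append([(a - b) / (2*h) for a, b in zip(rho(xp), rho(xm))])
    return [[sum(a*b for a, b in zip(cols[i], cols[j])) for j in range(3)]
            for i in range(3)]

def metric_closed_form(x):
    """Matrix G_LB read off from Eq. (16)."""
    tl, pl, tr = x
    u = math.sin(tl) * math.tan(pl) / math.sin(tr)   # tan(phi_R)
    w = 1.0 / (1.0 + u*u)
    a = w * math.cos(tl) * math.tan(pl) / math.sin(tr)                 # d phi_R / d theta_L
    b = w * math.sin(tl) / (math.cos(pl)**2 * math.sin(tr))            # d phi_R / d phi_L
    c = -w * math.sin(tl) * math.tan(pl) * math.cos(tr) / math.sin(tr)**2  # d phi_R / d theta_R
    pr = math.atan(u)
    return [[math.sin(pl/2)**2 + a*a/4, a*b/4, a*c/4],
            [a*b/4, (1 + b*b)/4, b*c/4],
            [a*c/4, b*c/4, math.sin(pr/2)**2 + c*c/4]]

x0 = (2.0, 0.6, 1.2)  # arbitrary illustrative point
G_num = metric_pullback(x0)
G_cf = metric_closed_form(x0)
max_err = max(abs(G_num[i][j] - G_cf[i][j]) for i in range(3) for j in range(3))
```

The off-diagonal entries are half of the corresponding cross-term coefficients in (16), because each product such as \(\dot{\theta }^L\dot{\theta }^R\) appears twice in the quadratic form (15).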

4 Euler-Lagrange formulation of binocular eye movement

Since a Riemannian metric defines a kinetic energy on the manifold, we use g in (15) to define the Lagrangian \({\mathcal {L}}=\dfrac{1}{2}~ g\) of the binocular system (Footnote 5). The controlled Euler-Lagrange equations are given by

$$\begin{aligned} \dfrac{\mathrm{d}}{{\mathrm{d}}t}\dfrac{\partial {\mathcal {L}}}{\partial \dot{\mu }} - \dfrac{\partial {\mathcal {L}}}{\partial \mu } = \tau _{\mu }, \end{aligned}$$
(17)

where \(\mu \in \{\theta ^L,\phi ^L,\theta ^R\}\). It follows from [21] that (17) can be written as

$$\begin{aligned} G_{LB}\ddot{\varTheta } + {\dot{G}}_{LB}\dot{\varTheta } - \nabla _{\varTheta }^T{\mathcal {L}} = \tau , \end{aligned}$$
(18)

where \(G_{LB}\) is the Riemannian matrix, \(\nabla _{\varTheta }\) is the gradient \(\Big [ \dfrac{\partial }{\partial {\theta ^L}} ~~\dfrac{\partial }{\partial {\phi ^L}}~~\dfrac{\partial }{\partial {\theta ^R}}\Big ]\), \(\varTheta = [\theta ^L~~\phi ^L~~\theta ^R]^{\mathrm{T}}\), and \(\tau\) is the vector of generalized torques \(\tau _\mu\). Further, as described in [21], we define the external torque vector T, in inertial coordinates, through

$$\begin{aligned} \tau = M_{LB}^{T}T, \end{aligned}$$
(19)

and (Footnote 6)

$$\begin{aligned} M_{LB}^TM_{LB} = 4 G_{LB}. \end{aligned}$$
(20)

Remark 1

The columns of the matrix M are called the Euler basis vectors, see [38], where T has been described as the resultant moment relative to the center of mass on the body.

Now we set up our dynamical system for the binocular eye rotation by defining

$$\begin{aligned} Z(t)=[z_1~~z_2~~z_3~~z_4~~z_5~~z_6]^{\mathrm{T}} = [\theta ^L~~{\dot{\theta }^L}~~\phi ^L~~{\dot{\phi }^L}~~\theta ^R~~{\dot{\theta }^R}]^{\mathrm{T}}. \end{aligned}$$
(21)

We require that the states go from some a priori agreed Z(0) to Z(1) while minimizing the control energy in a fixed interval of time

$$\begin{aligned} \int \nolimits _0^1 \dfrac{1}{2}\Vert T\Vert ^2\,{\mathrm{d}}t, \end{aligned}$$
(22)

where T is the vector of external torques on the system given by (Footnote 7)

$$\begin{aligned} T=[T_x^L~~ T_y^L~~ T_z^L~~ T_x^R~~ T_y^R~~ T_z^R]^{\mathrm{T}}. \end{aligned}$$
(23)

We denote the costate variables by

$$\begin{aligned} \varLambda =[\lambda _1~~\lambda _2~~\lambda _3~~\lambda _4~~\lambda _5~~\lambda _6]^{\mathrm{T}} \end{aligned}$$
(24)

and define the Hamiltonian as

$$\begin{aligned} {\mathcal {H}}(Z,\varLambda )= \varLambda ^T\cdot {\dot{Z}} -\displaystyle {\dfrac{1}{2} T^T T}. \end{aligned}$$
(25)

Using Hamilton’s equations [13], the system

$$\begin{aligned} \dfrac{\mathrm{d}}{{\mathrm{d}}t} \begin{bmatrix} Z\\ \varLambda \end{bmatrix} = F[z_1~\cdots ~z_6~~\lambda _1~\cdots ~\lambda _6~~ T_x^L~~ T_y^L~~ T_z^L~~ T_x^R~~ T_y^R~~ T_z^R] \end{aligned}$$
(26)

is now obtained. Using equations (18), (19), and (21), one can recast equation (25) as

$$\begin{aligned} {\mathcal {H}}= & \varLambda _1^T [{\dot{z}}_1 - z_2~~ {\dot{z}}_3 - z_4~~ {\dot{z}}_5 - z_6] \\ &+ \varLambda _2^T [ G_{LB}^{-1}(M_{LB}^T T - {\dot{G}}_{LB}{\dot{Z}}_1 + \nabla _{Z_1}^T{\mathcal {L}}) - {\dot{Z}}_2 ] \nonumber \\&-\dfrac{1}{2} T^T T, \end{aligned}$$
(27)

where \(\varLambda _1 = [\lambda _1~~ \lambda _3~~ \lambda _5]^{\mathrm{T}}\), \(\varLambda _2 = [\lambda _2~~ \lambda _4~~ \lambda _6]^{\mathrm{T}}\), \(Z_1 = [z_1~~ z_3~~ z_5]^{\mathrm{T}}\), and \(Z_2 = [z_2~~ z_4~~ z_6]^{\mathrm{T}}\). Finally, using Pontryagin’s Maximum Principle, the expressions for the optimal external torques (see [8]) are obtained:

$$\begin{aligned} \dfrac{\partial {\mathcal {H}}}{\partial T} = \varLambda _2^T G_{LB}^{-1}M^T_{LB} - T^T = 0, \end{aligned}$$
(28)

which we write symbolically as

$$\begin{aligned}{}[T_x^L~~ T_y^L~~ T_z^L~~ T_x^R~~ T_y^R~~ T_z^R] = [\lambda _2 ~~\lambda _4 ~~\lambda _6]~G_{LB}^{-1} M_{LB}^T. \end{aligned}$$
(29)
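The stationarity condition (28) and its solution (29) can be illustrated on a small numerical example. In the sketch below the 6 x 3 matrix `M_hyp` is a hypothetical stand-in for \(M_{LB}\) (whose actual entries are not reproduced here), \(G_{LB}\) is then defined through (20), and the costate values are arbitrary. The code checks that the torque from (29) maximizes the T-dependent part of the Hamiltonian (27), and that (19), (20) and (29) together give \(\tau = 4\varLambda _2\).

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))) / aug[r][r]
    return x

# Hypothetical full-rank 6x3 matrix standing in for M_LB (an assumption).
M_hyp = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0],
         [1.0, 1.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0]]
# Eq. (20): G = M^T M / 4.
G = [[sum(M_hyp[r][i] * M_hyp[r][j] for r in range(6)) / 4.0 for j in range(3)]
     for i in range(3)]
lam2 = [0.3, -0.2, 0.5]  # arbitrary costates (lambda_2, lambda_4, lambda_6)

# Eq. (29): T = M G^{-1} Lambda_2 (G is symmetric, so no extra transpose is needed).
w = solve(G, lam2)
T_opt = [sum(M_hyp[r][c] * w[c] for c in range(3)) for r in range(6)]

def h(T):
    """T-dependent part of (27): Lambda_2^T G^{-1} M^T T - T^T T / 2."""
    return sum(a * b for a, b in zip(T_opt, T)) - 0.5 * sum(t * t for t in T)

# Eq. (19): generalized torque tau = M^T T.
tau = [sum(M_hyp[r][c] * T_opt[r] for r in range(6)) for c in range(3)]
T_perturbed = [t + 0.1 for t in T_opt]
```

Because h is a concave quadratic in T, any perturbation away from `T_opt` lowers its value, which is what the maximum principle requires.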

The control torques can now be eliminated from the state space system (26) and we obtain the following dynamical system

$$\begin{aligned} \dfrac{\mathrm{d}}{{\mathrm{d}}t} \begin{bmatrix} Z\\ \varLambda \end{bmatrix} = {\tilde{F}}[Z~~\varLambda ]. \end{aligned}$$
(30)

Since we know only the initial and the final values of Z, we have a two-point boundary value problem. The resulting problem is solved using the COMSOL Multiphysics program (see [39]; Footnote 8). The computed Z and \(\varLambda\) variables are substituted in (29) to obtain the optimal vector T, which is denoted by \(T_{\mathrm{BVP}}\). In Sect. 5, the optimal \(T_{\mathrm{BVP}}\) is computed and plotted for ten different eye movement examples. In the following paragraph we describe how the ordinary differential equation (30) is implemented in COMSOL.

COMSOL is finite-element based software that can be used to solve both ordinary and partial differential equations. It has a graphical user interface (GUI), where one can implement the modeling equations and the corresponding initial and boundary conditions. In this paper, we use COMSOL’s Coefficient form PDE module to implement the system of ODEs in (30). The generic PDE takes the form:

$$\begin{aligned} e_a \dfrac{\partial ^2 {\varvec{u}}}{\partial t^2} + d_a \dfrac{\partial {\varvec{u}}}{\partial t} + \nabla \cdot (-c \nabla {\varvec{u}} - \alpha {\varvec{u}} + \gamma ) + \beta \cdot \nabla {\varvec{u}} + a{\varvec{u}} = {\varvec{f}}. \end{aligned}$$
(31)

When implementing (30) as a boundary value problem, we set the coefficient matrices \(e_a = d_a = c = \alpha = \gamma = a = 0\) and the unknown vector \({\varvec{u}} = [Z ~~ \varLambda ]^{\mathrm{T}} = [z_1~~\cdots ~~ z_6~~ \lambda _1~~ \cdots ~~\lambda _6]^{\mathrm{T}}\). Further, we choose the parameters \(\beta = I_{12}\) and \(f = {\tilde{F}}[Z~~\varLambda ]\). Finally, the space variable in (31) is taken as time t, and hence, \(\nabla := \dfrac{\partial }{\partial t}\) in (31). With these choices, (31) reduces to \(\dfrac{\partial {\varvec{u}}}{\partial t} = {\tilde{F}}[Z~~\varLambda ]\), which is (30). In this problem, all boundary conditions are imposed on the state variables \(z_1\ldots z_6\), and the costate variables stay free at both end-points of the time interval. We impose Dirichlet boundary conditions for the state variables at \(t=0\) using the first 6 equations and at \(t=1\) using the last 6 equations, and do not specify any boundary conditions for the costate variables.
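The structure of this boundary value problem, with states fixed at both ends and costates free, can be illustrated on a much smaller problem. The sketch below is not the COMSOL implementation and does not use the binocular dynamics: it swaps in a single-shooting method with a classical Runge-Kutta integrator, applied to a toy minimum-energy double integrator (\(\dot{x}=v\), \(\dot{v}=u\), cost \(\int _0^1 \tfrac{1}{2}u^2\,\mathrm{d}t\)). With the sign convention of (25), the optimal control is \(u=\lambda _2\) and the costates satisfy \(\dot{\lambda }_1=0\), \(\dot{\lambda }_2=-\lambda _1\). All names are illustrative assumptions.

```python
def rhs(y):
    """State-costate dynamics with the control eliminated (u = lambda_2)."""
    x, v, l1, l2 = y
    return [v, l2, 0.0, -l1]

def integrate(y0, n=200):
    """Classical RK4 on [0, 1]; returns the list of states at the grid points."""
    h = 1.0 / n
    y, traj = list(y0), [list(y0)]
    for _ in range(n):
        k1 = rhs(y)
        k2 = rhs([a + 0.5*h*b for a, b in zip(y, k1)])
        k3 = rhs([a + 0.5*h*b for a, b in zip(y, k2)])
        k4 = rhs([a + h*b for a, b in zip(y, k3)])
        y = [a + h*(p + 2*q + 2*r + s)/6.0 for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
        traj.append(y)
    return traj

def endpoint(l1, l2):
    """Final (x, v) when starting from x(0) = v(0) = 0 with initial costates (l1, l2)."""
    yf = integrate([0.0, 0.0, l1, l2])[-1]
    return yf[0], yf[1]

# The dynamics are linear, so the end state is affine in the unknown initial
# costates and one linear solve replaces the usual Newton iteration.
base, c1, c2 = endpoint(0.0, 0.0), endpoint(1.0, 0.0), endpoint(0.0, 1.0)
a11, a12 = c1[0] - base[0], c2[0] - base[0]
a21, a22 = c1[1] - base[1], c2[1] - base[1]
b1, b2 = 1.0 - base[0], 0.0 - base[1]        # target: x(1) = 1, v(1) = 0
det = a11*a22 - a12*a21
l1 = (b1*a22 - a12*b2) / det
l2 = (a11*b2 - a21*b1) / det

traj = integrate([0.0, 0.0, l1, l2])
u0 = traj[0][3]   # optimal control at t = 0
```

For this toy problem the solution is known in closed form, \(x(t)=3t^2-2t^3\) and \(u(t)=6-12t\), which gives a convenient check on the shooting result. For the nonlinear system (30) the linear solve would have to be replaced by an iterative root finder, and a collocation or finite-element solver may be more robust.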

5 Eye movement scenarios on LBIN explored by simulation

In this section, we present simulation results corresponding to three primary types of eye movements that arise from gazing at stationary targets in space: Right-to-Left (RL), Near-to-Far (NF), and Top-to-Bottom (TB), and vice versa. Further, we focus on gazing at near targets as well as distant targets. During our simulation, we consider combinations of the primary eye movements along with the distance of the gaze. Using the coordinates of the given initial and final target positions, we calculate the corresponding values for the angle variables: \(\theta ^L, \, \phi ^L,\) and \(\theta ^R\) and use the computed angle variables as boundary conditions for the resulting two-point boundary value problem described using (30). The optimal trajectories for the generalized coordinates, generalized velocities, generalized torques, and external torques are obtained over a unit interval of time (Footnote 9).

The simulation results consist of solving the two-point boundary value problem for each eye movement scenario given in Figs. 5, 6, 7, 8, 9, 10, 11, 12, 13 and 14. In each figure, we illustrate the optimal variations of (a) generalized coordinates, (b) generalized velocities, (c) generalized torques, and (d) external torques. Further, in subplots (e)–(h), the change in the focused gaze point of the two eyes and its projections on coordinate planes are shown when the gaze switches from one point target to another. Figs. 5, 6, 7 show simulation results for gazing at near targets and the remaining Figs. 8, 9, 10, 11, 12, 13 and 14 illustrate results for gazing at distant targets.
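One way to obtain the boundary values of the angle variables from a target position is to invert (8) and (9) directly. The sketch below is our own reading of that step, not a transcription of the authors' computation: it assumes the left eye at the origin, the right eye at (0, 1, 0), a target with positive x-coordinate, and \(\phi\) restricted to \([0,\pi ]\). The function names are hypothetical.

```python
import math

def listing_angles(direction):
    """(theta, phi) such that direction/|direction| equals
    (sin(theta) sin(phi), -cos(theta) sin(phi), cos(phi)), with phi in [0, pi]."""
    norm = math.sqrt(sum(c * c for c in direction))
    gx, gy, gz = (c / norm for c in direction)
    phi = math.acos(gz)
    theta = math.atan2(gx, -gy)   # lies in [0, pi] whenever gx >= 0
    return theta, phi

def boundary_angles(target, right_center=(0.0, 1.0, 0.0)):
    """Angles (theta_L, phi_L, theta_R, phi_R) for both eyes fixating `target`."""
    theta_l, phi_l = listing_angles(target)                      # left eye at the origin
    relative = tuple(t - c for t, c in zip(target, right_center))
    theta_r, phi_r = listing_angles(relative)
    return theta_l, phi_l, theta_r, phi_r

# Initial target of Example 1.
theta_l, phi_l, theta_r, phi_r = boundary_angles((1.0, 1.0, 1.0))
# phi_R is redundant: it must agree with Eq. (13).
phi_r_from_13 = math.atan(math.sin(theta_l) * math.sin(phi_l)
                          / (math.sin(theta_r) * math.cos(phi_l)))
```

Only \((\theta ^L,\phi ^L,\theta ^R)\) enter the boundary conditions; the comparison with (13) is included as a consistency check on the inversion.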

Fig. 5 Example 1 (SV-RL): Target located at a short view. The gaze point moves from Right-to-Left between targets that are located at the points (1, 1, 1) and (1, 0, 1)

Fig. 6 Example 2 (SV-NF): Target located at a short view. The gaze point moves from Near-to-Far between targets that are located at the points (1, 0.5, 1) and (1, 0.5, 2)

Fig. 7 Example 3 (SV-TB): Target located at a short view. The gaze point moves from Top-to-Bottom between targets that are located at the points (2, 0.5, 1) and (1, 0.5, 1)

Fig. 8 Example 4 (LV-RL): Target located at a long view. The gaze point moves from Right-to-Left between targets that are located at the points (3, 2, 4) and \((3,-2,4)\)

Fig. 9 Example 5 (LV-NF): Target located at a long view. The gaze point moves from Near-to-Far between targets that are located at the points (3, 2, 4) and (3, 2, 8)

Fig. 10 Example 6 (LV-TB): Target located at a long view. The gaze point moves from Top-to-Bottom between targets that are located at the points (7, 2, 4) and (3, 2, 4)

Fig. 11 Example 7 (LV-RLNF): Target located at a long view. The gaze point moves from Right-to-Left with Near-to-Far between targets that are located at the points (3, 2, 4) and \((3,-2,8)\)

Fig. 12 Example 8 (LV-RLTB): Target located at a long view. The gaze point moves from Right-to-Left with Top-to-Bottom between targets that are located at the points (7, 2, 4) and \((3,-2,4)\)

Fig. 13 Example 9 (LV-TBNF): Target located at a long view. The gaze point moves from Top-to-Bottom with Near-to-Far between targets that are located at the points (7, 2, 4) and (3, 2, 8)

Fig. 14 Example 10 (LV-TBRLNF): Target located at a long view. The gaze point moves from Top-to-Bottom with Right-to-Left with Near-to-Far between targets that are located at the points (7, 2, 4) and \((3,-2,8)\)

Fig. 15 Example 11: The gaze point moves along a vector parallel to gaze direction of the left eye between targets that are located at the points (1, 1, 1) and (3, 3, 3). The simulation shows that the left eye does not move, whereas the right eye does. This movement is not observed in an actual human binocular eye movement, where the left eye is not innervated at all whereas the right eye is

In each of the examples below, the centers of the two eyes are located at (0, 0, 0) and (0, 1, 0). The separation between the eyes is along the y-axis and is assumed to be of unit length (see Fig. 4).

Example 1

(SV-RL) The target points are at a ‘short view’ compared to the separation of the two eyes. Two eyes are gazing from right-to-left to move between targets that are located at the points (1, 1, 1) and (1, 0, 1). Hence, both eyes have to primarily rotate about the upward-pointing axis (x-axis) in the counterclockwise direction (see Fig. 5h). In Fig. 5d it is also observed that the torques \(T_x^L\) and \(T_x^R\) are applied about the x-axis. In Fig. 5f it is observed that the eyes rotate in the clockwise direction about the forward-pointing z-axis. Thus the torques \(T_z^L\) and \(T_z^R\) being applied about the z-axis in the clockwise direction are depicted in Fig. 5d with a sign opposite to that of \(T_x^L\) and \(T_x^R\).

Example 2

(SV-NF) The target points are at a ‘short view’ compared to the separation of the two eyes. Two eyes are gazing from near-to-far to move between targets that are located at the points (1, 0.5, 1) and (1, 0.5, 2). Hence, both eyes have to primarily rotate about the right-pointing axis (y-axis) in the clockwise direction (see Fig. 6g). In Fig. 6d it is also observed that the torques \(T_x^L\) and \(T_x^R\) are applied about the x-axis. In Fig. 6h it is observed that the eyes rotate in opposite directions about the upward-pointing x-axis. Hence \(T_x^L\) and \(T_x^R\) have opposite signs.

Example 3

(SV-TB) The target points are at a ‘short view’ compared to the separation of the two eyes. Two eyes are gazing from top-to-bottom to move between targets that are located at the points (2, 0.5, 1) and (1, 0.5, 1). Therefore, both eyes have to primarily rotate about the right-pointing axis (y-axis), and the direction of rotation about the y-axis should be in the clockwise direction for both eyes, and hence, the initial signs of the external torques, about the y-axis, are negative. This behavior can be observed in Fig. 7d as larger external torques are applied about the y-axis: \(T_y^L\) and \(T_y^R\). Interestingly, in contrast to Example 2, both eyes experience nonzero external torque values along the z-axis.

Example 4

(LV-RL) The target points are at a ‘long view’ compared to the separation of the two eyes. Two eyes are gazing from right-to-left to move between targets that are located at the points (3, 2, 4) and \((3,-2,4)\). As depicted in Fig. 8d and h, the primary axis of rotation is the upward-pointing x-axis, and hence, we observe larger external torques around it. In Fig. 8f, it is observed that the eyes rotate in the clockwise direction about the forward-pointing z-axis. Thus the torques \(T_z^L\) and \(T_z^R\) being applied about the z-axis in the clockwise direction are depicted in Fig. 8d with a sign opposite to that of \(T_x^L\) and \(T_x^R\).

Example 5

(LV-NF) The target points are at a ‘long view’ compared to the separation of the two eyes. Two eyes are gazing from near-to-far and moving between targets that are located at the points (3, 2, 4) and (3, 2, 8). Hence, both eyes are primarily rotating about the right-pointing y-axis in the clockwise direction (see Fig. 9g). In Fig. 9d, it is also observed that the torques \(T_x^L\) and \(T_x^R\) are applied about the x-axis. Using the locations of the two eyes and from Fig. 9h, it is observed that both eyes rotate in the counterclockwise direction about the upward-pointing x-axis.

Example 6

(LV-TB) The target points are at a ‘long view’ compared to the separation of the two eyes. Two eyes are gazing from top-to-bottom to move between targets that are located at the points (7, 2, 4) and (3, 2, 4). Therefore, again, both eyes have to primarily rotate about the right-pointing axis (y-axis), and as illustrated in Fig. 10g, the direction of rotation about the y-axis should be in the clockwise direction for both eyes, and hence, the initial sign of the external torques is negative. This behavior can be observed in Fig. 10d as larger external torques are applied about the y-axis: \(T_y^L\) and \(T_y^R\). It is also observed from Fig. 10f that the eyes rotate about the z-axis in the same direction (counterclockwise). Further, Fig. 10h indicates the presence of a rotation of two eyes about the upward-pointing x-axis.

Example 7

(LV-RLNF) The target points are at a ‘long view’ compared to the separation of the two eyes. Two eyes are gazing from right-to-left with near-to-far (RLNF) to move between targets that are located at the points (3, 2, 4) and \((3,-2,8)\). In this eye movement scenario, as shown in Fig. 11d, g, and h, both eyes primarily rotate about the upward-pointing axis (x-axis) as well as the right-pointing axis (y-axis). About the former axis, the eyes rotate in the counterclockwise direction and about the latter axis the rotation is in the clockwise direction. Moreover, the largest external torques are applied about the x-axis: \(T_x^L\) and \(T_x^R\). It is also observed from Fig. 11f that the eyes rotate about the z-axis in the clockwise direction yielding negative initial external torques \(T_z^L\) and \(T_z^R\).

Example 8

(LV-RLTB) The target points are at a ‘long view’ compared to the separation of the two eyes. Two eyes are gazing from right-to-left with top-to-bottom (RLTB) to move between targets that are located at the points (7, 2, 4) and \((3,-2,4)\). Similar to RLNF, both eyes primarily rotate about the upward-pointing axis (x-axis) as well as the right-pointing axis (y-axis). As illustrated in Fig. 12d, g, and h, the eyes rotate about the x-axis in the counterclockwise direction resulting in positive initial torques: \(T_x^L\) and \(T_x^R\). Further, it is also observed from Fig. 12f that the eyes rotate about the z-axis in the clockwise direction yielding negative initial external torques \(T_z^L\) and \(T_z^R\).

Example 9

(LV-TBNF) The target points are at a ‘long view’ compared to the separation of the two eyes. Two eyes are gazing from top-to-bottom with near-to-far (TBNF) to move between targets that are located at the points (7, 2, 4) and (3, 2, 8). In this eye movement scenario, Fig. 13d and g indicate that both eyes primarily rotate about the right-pointing axis (y-axis) in the clockwise direction resulting in negative initial external torques: \(T_y^L\) and \(T_y^R\). Interestingly, Fig. 13f and h show counterclockwise rotations about the x- and z-axes.

Example 10

(LV-TBRLNF) The target points are at a ‘long view’ compared to the separation of the two eyes. Two eyes are gazing from top-to-bottom with right-to-left and near-to-far (TBRLNF) to move between targets that are located at the points (7, 2, 4) and \((3,-2,8)\). Figure 14d indicates that both eyes primarily rotate about the x- and y-axes and possess larger initial external torques about those two axes. Further, Fig. 14f, g, and h show a counterclockwise rotation about the x-axis and clockwise rotations about the remaining two axes for both eyes.

For each of Examples 1–10, Table 1 lists the initial and the final points of the target, the Euclidean distance between the two points together with the arc length, the energy spent to make the optimal move, and the corresponding energy spent per unit arc length. With these data, the different examples can be compared, as is done in Sect. 6.
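As a small illustration of two of the quantities tabulated in Table 1, the following sketch computes the straight-line (Euclidean) distance between the target pairs stated in Examples 7–10 and shows how an energy per unit arc length would be formed. The function names are illustrative only, and the arc lengths and energies themselves come from the optimal control computation, so they are not reproduced here.

```python
import numpy as np

# Target pairs (initial, final) exactly as stated in Examples 7-10.
targets = {
    7: ((3, 2, 4), (3, -2, 8)),
    8: ((7, 2, 4), (3, -2, 4)),
    9: ((7, 2, 4), (3, 2, 8)),
    10: ((7, 2, 4), (3, -2, 8)),
}


def straight_line_distance(start, end):
    """Euclidean distance between two target points in R^3."""
    return float(np.linalg.norm(np.subtract(end, start)))


def energy_per_unit_arc_length(energy, arc_length):
    """Ratio used to compare examples; both inputs must come from the solver."""
    if arc_length <= 0:
        raise ValueError("arc length must be positive")
    return energy / arc_length


distances = {k: straight_line_distance(a, b) for k, (a, b) in targets.items()}
for k, d in distances.items():
    print(f"Example {k}: straight-line distance = {d:.3f}")
```

An arc length noticeably larger than the corresponding straight-line distance indicates a curved optimal gaze path.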

Table 1 Ten different gaze movements on LBIN combining short and long view from examples 1 to 10

The computed values of the optimal external torque function (let us call this \(T_{\mathrm{BVP}}(t)\)) for each of the ten Examples 1–10 have been plotted in Figs. 5d, 6d, 7d, 8d, 9d, 10d, 11d, 12d, 13d and 14d. In Table 2, we list all six components of \(T_{\mathrm{BVP}}(0)\), the initial value of the optimal external torque function, for each of the ten examples.

Table 2 Initial values of the optimal external torques \(T_{\mathrm{BVP}}(0)\)

6 Discussion

6.1 Discussion on the simulation results for Examples 1–10

Examples 1–3, in Figs. 5d, 6d, 7d, show that both eyes experience the same magnitude in the corresponding external torque components, i.e., \(|T_x^L| = |T_x^R|,\, |T_y^L| = |T_y^R|,\) and \(|T_z^L| = |T_z^R|\). This property is due to the symmetry between the two eyes and the initial and final target positions. Examples 4–6, in Figs. 8, 9, 10, correspond to gazing at distant targets and exhibit similar qualitative properties in the variation of the external torques as in the first three examples. However, we observe a slight difference in the corresponding external torques (\(T_{i}^L \ne T_{i}^R\) for \(i \in \{x,y,z\}\)) that arises due to the asymmetry in the configurations of the left and right eyes with respect to the initial and final target positions. We make the following observations:

  • Optimal path that includes ‘Moving Right to Left’ is not a straight line: In Examples 1, 4, 7, 8 and 10, wherein the initial and the final points involve movements in the ‘Right to Left direction’, the arc lengths are higher than the straight line distance between the initial and the final target positions (see Table 1). For sideways movement of the gaze, the optimal arc is not straight.

  • Optimal ‘Near to Far and Top to Bottom Movements’ are along a straight line: In Examples 2, 3, 5, 6 and 9, wherein the initial and the final points do not involve movements in the ‘Right to Left direction,’ the arc lengths are close to the straight line distance between the initial and the final target positions (see Table 1). Said differently, optimal arcs in the xz-plane are nearly straight.

  • ‘Optimal movements between targets at a short view are more energy expensive,’ compared to targets that are further away. Comparing the energy spent per unit arc length for Examples 1 and 4, 2 and 5, and 3 and 6 in Table 1 confirms this statement.

  • ‘Right to Left movements are more expensive’ compared to Near to Far and Top to Bottom movements. This fact is evident by comparing Example 1 with Examples 2 and 3, and Example 4 with Examples 5 and 6 (see Table 1, energy spent per unit arc length). When the primary eye movements are combined, this deficiency is not particularly visible (see Table 1, rows 7–10).

  • In many of the examples, the components \(T_x\), \(T_y\) and \(T_z\) of the optimal torques for the left and the right eyes satisfy Hering’s Law (see Footnote 10). It is evident from Figs. 5d, 6d, 7d, 8d (Examples 1–4) that the \(T_x\), \(T_y\) and \(T_z\) components of the external torque vector are, up to sign, the same function of time for the two eyes.

Regarding satisfaction of Hering’s Law, an exception occurs in Example 5 (see Fig. 9d), wherein the \(T_x\) components are substantially different between the left and the right eyes. In Example 6 (see Fig. 10d), the \(T_z\) and the \(T_x\) components do not have precisely the same magnitude, whereas \(T_y\), the strongest component, does have the same magnitude between the two eyes. Similar comments can be made for other examples. It is indeed quite interesting to note that in Example 10 (see Fig. 14d), wherein the eye movement has all three primary components, Hering’s Law is satisfied: the three external torque components in the two eyes are almost identical. In Example 11, to be discussed in Sect. 6.2, the initial and the final points are chosen in such a way that the left eye does not have to move and movement of the right eye alone is sufficient to switch the gaze from the initial to the final point. In this example, we show that optimal eye movement does not support Hering’s Law.

6.2 Hering’s and Helmholtz’s controversy

In this section we ask whether, from the point of view of optimal control, it makes sense to assume that the left and the right eyes are simultaneously innervated or that the innervations to the two eyes are separate. In [34], a binocular eye movement experiment was conducted in which the initial point, the final point and the center of the left eye were on one line. The binocular gaze was shifted from the initial to the final point. In this experiment, the left eye does not need to move at all, whereas the right eye needs to rotate. If the eyes were innervated separately, only the right eye would be required to move, as was proposed by Helmholtz. In the experiment, however, the eye movement was observed to be split into a cascade of vergence and version movements. In Example 11, the experiment from [34] is repeated from the point of view of optimal control. In Example 12, the vergence/version decomposition of Example 11 is described. In Example 13, the vergence/version decomposition of Example 10 is described. The details exhibited in Figs. 16 and 17 are described as follows.

Fig. 16 Example 12: Target located at a mid range. Vergence eye movement followed by a small version

Fig. 17 Example 13: Target located at a distant range. A small vergence eye movement followed by a version

Example 11

The binocular gaze moves between the points (1, 1, 1) and (3, 3, 3). The two eyes are centered, as before, at the points (0, 0, 0) and (0, 1, 0). As shown in Fig. 15, the optimal gaze trajectory is a straight line wherein the left eye does not rotate at all and the right eye changes its gaze from the initial to the final point. Thus the optimal eye movement corroborates Helmholtz’s prediction of how the eyes are controlled, i.e., they are innervated separately.
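The geometry of Example 11 can be checked with a short sketch. It assumes that the left eye is the one centered at (0, 0, 0), which is the only one of the two centers collinear with both targets, and it measures only the change in gaze direction, not the full eye rotation under Listing’s Law. The function names are illustrative.

```python
import numpy as np


def gaze_direction(eye_center, target):
    """Unit vector from the eye center to the target."""
    d = np.asarray(target, dtype=float) - np.asarray(eye_center, dtype=float)
    return d / np.linalg.norm(d)


def gaze_change_deg(eye_center, start, end):
    """Angle (degrees) between the gaze directions to the two targets."""
    u = gaze_direction(eye_center, start)
    v = gaze_direction(eye_center, end)
    # Clip guards against round-off pushing the dot product outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(u @ v, -1.0, 1.0))))


left_eye, right_eye = (0, 0, 0), (0, 1, 0)   # left/right assignment is an assumption
start, end = (1, 1, 1), (3, 3, 3)

left_change = gaze_change_deg(left_eye, start, end)
right_change = gaze_change_deg(right_eye, start, end)
print(f"left eye gaze change:  {left_change:.4f} deg")
print(f"right eye gaze change: {right_change:.4f} deg")
```

The left eye’s gaze direction is unchanged because both targets lie on one ray through its center, whereas the right eye’s gaze direction changes by a finite angle.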

Example 12

(vergence/version) We now split the eye movement in Example 11 into a cascade of ‘vergence’ followed by ‘version’, as indicated in [34]. Under vergence, the gaze point moves between (1, 1, 1) and (3.28, 2.14, 3.28) (see Footnote 11). It is evident from Fig. 16d that the two eyes move in opposite directions during the vergence phase of the eye movement, as indicated by the opposite signs of \(T_z^L\) and \(T_z^R\). We assume that the vergence movement is completed in the time interval \([0,\dfrac{1}{2}]\). Next we consider version during the interval \([\dfrac{1}{2},1]\), in which the gaze point moves between (3.28, 2.14, 3.28) and (3, 3, 3). It is evident from Fig. 16 that this time the two eyes move in the same direction (indicated by the same signs of \(T_z^L\) and \(T_z^R\)).
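The intermediate gaze point can be approximated with the sketch below. The construction used is an assumption, not necessarily the procedure of Footnote 11: vergence moves the gaze point along the ray from the midpoint of the two eye centers through the initial target, until its distance from that midpoint equals the distance of the final target. The function name is illustrative. Under this assumption the sketch agrees with the intermediate point reported in Example 12 to two decimal places, and with the point F reported in Example 13 to within about 0.02.

```python
import numpy as np


def vergence_version_split(start, end, eye_left, eye_right):
    """Intermediate gaze point for a vergence-then-version cascade.

    Assumed construction: slide along the ray from the inter-eye midpoint
    through `start` until the distance to the midpoint matches that of `end`.
    """
    mid = 0.5 * (np.asarray(eye_left, dtype=float) + np.asarray(eye_right, dtype=float))
    s = np.asarray(start, dtype=float) - mid
    e = np.asarray(end, dtype=float) - mid
    return mid + s * (np.linalg.norm(e) / np.linalg.norm(s))


eyes = ((0, 0, 0), (0, 1, 0))

# Example 12: split of the movement in Example 11.
F12 = vergence_version_split((1, 1, 1), (3, 3, 3), *eyes)
# Example 13: split of the movement in Example 10.
F13 = vergence_version_split((7, 2, 4), (3, -2, 8), *eyes)

print("Example 12 intermediate point:", np.round(F12, 2))
print("Example 13 intermediate point:", np.round(F13, 2))
```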

Example 13

(vergence/version) In this example, we consider the vergence/version decomposition of the eye movement chore from Example 10. The target points in Example 10 were at (7, 2, 4) and \((3,-2,8)\). The intermediate point F is computed as (7.6, 2.14, 4.34) using the procedure outlined in Example 12. The vergence and the version movements are sketched in Fig. 17, along with the external torques (see Table 3 as well). It is evident from Fig. 17e that in this example, the vergence movement is small compared to version. The magnitude of the external torque vector for vergence is considerably smaller (see Fig. 17d) than that for version. The actual numbers are in the last row of Table 3. The optimal control is computed by splitting the total time range [0, 1] into two equal intervals \(\left[ 0,\dfrac{1}{2}\right]\) and \(\left[ \dfrac{1}{2},1\right]\). We would like to note that perhaps a different choice such as \([0,\mu ]\), \([\mu ,1]\) for a small value of \(\mu\) would reduce the total energy requirement (see Footnote 12).

Table 3 Arc length and energy comparison when eye movements are split up into vergence and version. Eye movement in Example 10 is split up in Example 13, and Example 11 is split up in Example 12

Remark 2

When an optimal gaze movement is split up between vergence and version, the expended energy rises rapidly. This is evident from Examples 12 and 13, wherein we observe that by and large, the version movements are energy inefficient.

6.3 The structure of the optimal controller

Recall that for eyes centered on the y-axis and for rotation matrices in \(\mathbf{LBIN}\) (as introduced in Sect. 3), the optimal external torque function obtained in (29) has an interesting linear structure. For each of the Examples 1–11, the optimal external torque function (see Figs. 5d, 6d, 7d, 8d, 9d, 10d, 11d, 12d, 13d, 14d, and 15d) is of the form

$$\begin{aligned} T_{\mathrm{BVP}}(t) ~ \approx ~ T_{\mathrm{BVP}}(0)(1-2t), \end{aligned}$$
(32)

where \(t \in [0,1]\). The acronym ‘BVP’ stands for Boundary Value Problem, and \(T_{\mathrm{BVP}}(t)\) is the optimal external torque controller obtained by solving the ‘BVP’ outlined in (30) (see Footnote 13).
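One way to quantify how closely a computed torque history follows the linear form (32) is sketched below. The six-component vector `T0` and the perturbation are made-up illustrative data, not values from Table 2, and the function names are hypothetical. The sketch assumes the first time sample is \(t=0\).

```python
import numpy as np


def linear_torque(T0, t):
    """Evaluate T0 * (1 - 2t) at each time in t; rows index time, columns the 6 components."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    return np.outer(1.0 - 2.0 * t, np.asarray(T0, dtype=float))


def relative_deviation(T_samples, t):
    """Largest departure of sampled torques from the linear form (32),
    relative to the largest torque magnitude in the samples.
    Assumes t[0] == 0 so that T_samples[0] is the initial torque."""
    T_samples = np.asarray(T_samples, dtype=float)
    model = linear_torque(T_samples[0], t)
    return float(np.max(np.abs(T_samples - model)) / np.max(np.abs(T_samples)))


# Made-up data for illustration only (not taken from Table 2).
t = np.linspace(0.0, 1.0, 11)
T0 = np.array([0.4, -0.2, -0.1, 0.4, -0.2, 0.1])
exact = linear_torque(T0, t)
perturbed = exact + 1e-3 * np.sin(np.pi * t)[:, None]

dev_exact = relative_deviation(exact, t)
dev_perturbed = relative_deviation(perturbed, t)
print(dev_exact, dev_perturbed)
```

Under (32) the torque vanishes at \(t = 1/2\) and satisfies \(T_{\mathrm{BVP}}(1) = -T_{\mathrm{BVP}}(0)\), so the second half of the movement mirrors the first.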

Note that in (29) we had

$$\begin{aligned} T_{\mathrm{BVP}}(t) ~=~ [\lambda _2(t) ~ \lambda _4(t) ~ \lambda _6(t)] W(t), \end{aligned}$$
(33)

where

$$\begin{aligned} W(t) ~=~ G_{\mathrm{LB}}^{-1} M_{\mathrm{LB}}^T \end{aligned}$$

is a \(3 \times 6\) matrix and (33) indicates that \(T_{\mathrm{BVP}}(t)\) is in the row span of W(t), for \(t \in [0,1]\). If \(T_{\mathrm{BVP}}(t)\) is indeed of the form (32), it would follow that there exists a vector \(T_{\mathrm{BVP}}(0)\) that is contained in the row span of W(t), \(t \in [0,1]\), i.e.,

$$\begin{aligned} T_{\mathrm{BVP}}(0) \in \textstyle \bigcap \limits _{t \in [0,1]} \text {row span of} ~ W(t). \end{aligned}$$
(34)

We now state and prove the following proposition.

Proposition 1

Assume that \(T_{\mathrm{BVP}}(t)\) satisfies (32). If \(t_1, t_2 \in [0,1]\) are such that \(t_1 \ne t_2\) and we define a \(6 \times 6\) matrix

$$\begin{aligned} L =\left[ \begin{array}{c} W(t_1) \\ W(t_2) \end{array} \right] , \end{aligned}$$
(35)

it follows that \({\mathrm{rank}} ~ L ~ < ~ 6\).

Proof of Proposition 1

From (32) and (33), or equivalently from (34), it follows that there exists a nonzero vector \(T_{\mathrm{BVP}}(0)\) contained in the row span of \(W(t_1)\) and in the row span of \(W(t_2)\). Hence there exist nonzero vectors \((a_1, b_1, c_1)\) and \((a_2, b_2, c_2)\) such that

$$\begin{aligned} (a_1, b_1, c_1) W(t_1) ~=~ (a_2, b_2, c_2) W(t_2). \end{aligned}$$

Hence

$$\begin{aligned} (a_1, b_1, c_1, -a_2, -b_2, -c_2)~L~=~0. \end{aligned}$$

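Since the row vector multiplying L is nonzero, the rows of L are linearly dependent, and therefore \({\mathrm{rank}} ~ L ~ < ~ 6\). \(\square\)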

Remark 3

For each of Examples 1–10, we have computed in Table 4 the singular values of the matrix L (see (35)), choosing \(t_1=0\) and \(t_2= 0.25, 0.50, 0.75, 1.00\). Each row in the table lists the 6 singular values. There are 4 pairs of \((t_1, t_2)\) for each example, contributing 4 rows. If (32) is perfectly satisfied, then the smallest singular value should be zero. This would imply that the last column of Table 4 should ideally be zero.
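The rank test of Proposition 1 can be illustrated numerically. In the sketch below, the two \(3 \times 6\) matrices are synthetic stand-ins for \(W(t_1)\) and \(W(t_2)\), built so that their row spans share one common vector playing the role of \(T_{\mathrm{BVP}}(0)\). They are not the matrices of the eye model, and the helper name is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic vector standing in for T_BVP(0); not a value from Table 2.
common = rng.standard_normal(6)


def random_W_containing(v, rng):
    """Synthetic 3x6 matrix whose row span contains the vector v."""
    basis = np.vstack([v, rng.standard_normal((2, 6))])
    mix = rng.standard_normal((3, 3))   # generic mixing of the three basis rows
    return mix @ basis


W1 = random_W_containing(common, rng)   # stand-in for W(t_1)
W2 = random_W_containing(common, rng)   # stand-in for W(t_2)

# Stack as in (35) and inspect the singular values (returned in descending order).
L = np.vstack([W1, W2])
sigma = np.linalg.svd(L, compute_uv=False)
print("singular values of L:", sigma)
print("smallest / largest:  ", sigma[-1] / sigma[0])
```

Because the six rows of L lie in a subspace of dimension at most five, the smallest singular value is zero up to round-off. With matrices from an actual simulation, the size of the last singular value relative to the others measures how well (32) holds.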

Table 4 Singular values from the matrix in (35), where \(t_1=0\) and \(t_2=0.25, 0.50, 0.75, 1\). Each row of the table lists out 6 singular values for 4 pairs of \((t_1, t_2)\). There are 10 examples in all

Remark 4

In Table 2, the vector \(T_{\mathrm{BVP}}(0)\) is listed for each of the ten examples. Equation (32) indicates that these vectors completely determine the optimal external torque function. To verify that the control input \(T_{\mathrm{BVP}}(0)(1-2t)\), applied to the dynamical system (18), (19), reproduces the optimal trajectory, we solve the corresponding initial value problem. In Fig. 18, we display the generalized coordinates from the initial value problem and the boundary value problem from Example 10 (see Fig. 14a), showing that the coordinates match perfectly. The matching has been performed for all the other examples as well, but the results are not displayed.

Fig. 18 Comparison of the generalized coordinates obtained from solving the boundary value problem in Example 10 and the corresponding initial value problem with the control input \(\boldsymbol{T}(t) =\boldsymbol{T}_{\mathrm{BVP}}(0)(1-2t)\). This simulation has been carried out for Examples 1–10 and the generalized angle coordinates for the boundary value problem perfectly match the initial value problem

Consider an optimal control problem where the goal is to move from a point C to an arbitrary point \(D_i\) (see Fig. 19). Assuming that the initial point C is fixed and the terminal point \(D_i\) is arbitrary, we consider the following map:

$$\begin{aligned} \chi : \mathbb {R}^3 \rightarrow \mathbf {Gr}(3,6), \end{aligned}$$
(36)

where

$$\begin{aligned} C ~ \longmapsto ~ {\mathrm{row}} ~ \text {span} ~ \text {of} ~ W(0). \end{aligned}$$

In (36) the notation \(\mathbf {Gr}(3,6)\) refers to the Grassmannian manifold [40] of homogeneous 3-planes in \(\mathbb {R}^6\).

Fig. 19 Optimal control problem where the goal is to move from a point C to an arbitrary point \(D_i\)

The following proposition pertains to the optimal external torque function that transfers the gaze optimally from C to \(D_i\). We denote this function by \(T_{CD_i}(t)\) (see Footnote 14).

Proposition 2

Assume that the optimal external torque function \(T_{CD_i}(t)\) satisfies (32) for every \(D_i\) in \(\mathbb {R}^3\). It follows that

$$\begin{aligned} T_{CD_i}(t) \in \chi (C), \end{aligned}$$

for every \(t \in [0,1]\).

Proof of Proposition 2

It follows from (33) that (see Footnote 15)

$$\begin{aligned} T_{CD_i}(0) = [\lambda _2(0) ~ \lambda _4(0) ~ \lambda _6(0)]_{CD_i} W(0). \end{aligned}$$

This implies that \(T_{CD_i}(0) \in \chi (C)\). Since the external torque function \(T_{CD_i}(t)\) satisfies (32), it is a scalar multiple of \(T_{CD_i}(0)\) for every \(t \in [0,1]\), and it follows that \(T_{CD_i}(t) \in \chi (C)\). \(\square\)

To end this section we would like to make the following remark.

Remark 5

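Since \(W(0) = G_{\mathrm{LB}}^{-1} M_{\mathrm{LB}}^T\) and \(G_{\mathrm{LB}}^{-1}\) is invertible, the image of the map \(\chi\) in (36) can be described as follows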

$$\begin{aligned} \hbox {row span of } W(0) = \hbox {row span of } M^T_{LB}. \end{aligned}$$

7 Conclusions

Using the Riemannian formulation proposed in this paper, our main goal is to propose and solve a binocular optimal control problem, wherein a pair of visual sensors rotate to focus attention from one target to another. The associated dynamical system (26) is written out using the Euler-Lagrange equations as a control system with ‘External Torque’ as input. The goal is to minimize a suitable quadratic control energy function while the binocular system changes focus from one target to another. Various eye movement chores were simulated in Sect. 5 and their optimal gaze trajectories were computed. In Examples 12 and 13, the optimal gaze trajectory is split up into an optimal vergence and version movement. We observe that the ‘vergence/version’ movements are not energy efficient and note this fact in Table 3. In this table, it is noted that the energy requirement for two consecutive transfers, vergence followed by version, is significantly higher than the optimal energy requirement for a direct transfer.

We conclude this paper by emphasizing two particularly interesting observations that we were able to demonstrate in our simulations in Sect. 5. The first one is that ‘Eye movement chores side to side, i.e., from right to left and back are energy expensive,’ as opposed to eye movements along the xz-plane (see Table 1, where we observe that in Examples 1 and 4 the energy per unit arc length numbers are higher compared to the corresponding numbers for Examples 2, 3, 5 and 6). The second one is that ‘The optimal external torque function is linear in time’ (see Sect. 6.3). Somewhat surprisingly, we observe that the ‘initial value of the external torque vector completely determines the entire function in the interval [0, 1]’ and the choice of this vector is determined by the initial and the final position of the gaze point in \(\mathbb {R}^3\). Moreover, the initial value of the external torque vector lies in a fixed 3-plane in \(\mathbb {R}^6\) determined by the initial value of the gaze. Changing the initial gaze point alters the 3-plane in \(\mathbb {R}^6\) as a point in the Grassmannian manifold \(\mathbf {Gr}(3,6)\) (see Footnote 16 and [40]).

We do not have a specific explanation for the two simulation-based observations described here. However, they do have consequences in terms of how visual exploration could be carried out for the binocular system of eyes. If there is a set of n visual targets in space, for some positive integer n, one would like to design a search scheme to sequentially explore the targets. A smart strategy would be to avoid, as much as possible, Right to Left and Left to Right movements, because these are energy inefficient. Finally, the linearity of the optimal external torque function has computational consequences. To compute these functions, it is enough to compute \(T_{\mathrm{BVP}}(0)\), the optimal torque vector at the initial time \(t=0\). In principle one can compute these vectors over a discrete set of target points and use interpolation for other intermediate values. This simplifies the optimal controller computation as opposed to solving the boundary value problem for every set of target points.

As a final remark we note that optimal binocular eye movement is an important problem to study in order to design machines that can visually explore a terrain optimally. A limitation of the proposed research is that the optimal controllers we design are typically implemented in open loop, and their computation requires solving a two-point boundary value problem using COMSOL, which may not be readily available. Such controllers are perhaps not robust, and keeping the final target point stable could become an issue. As a possible subject for future work, we propose to compare the optimal controllers synthesized in this paper with other stabilizing, but possibly non-optimal, controllers.