Journal of Computational Neuroscience, Volume 28, Issue 1, pp 47–64

Motion processing with wide-field neurons in the retino-tecto-rotundal pathway


  • B. Dellen
    • Bernstein Center for Computational Neuroscience, Max-Planck-Institute for Dynamics and Self-Organization
    • Institut de Robòtica i Informàtica Industrial (CSIC-UPC)
  • Ralf Wessel
    • Physics Department, CB 1105, Washington University in St. Louis
  • John W. Clark
    • Physics Department, CB 1105, Washington University in St. Louis
  • Florentin Wörgötter
    • Bernstein Center for Computational Neuroscience, III. Physikalisches Institut-Biophysik, Georg-August Universität Göttingen
Open Access Article

DOI: 10.1007/s10827-009-0186-y

Cite this article as:
Dellen, B., Wessel, R., Clark, J.W. et al. J Comput Neurosci (2010) 28: 47. doi:10.1007/s10827-009-0186-y


The retino-tecto-rotundal pathway is the main visual pathway in non-mammalian vertebrates and has been found to be highly involved in visual processing. Despite the extensive receptive fields of tectal and rotundal wide-field neurons, pattern discrimination tasks suggest a system with high spatial resolution. In this paper, we address the problem of how global processing performed by motion-sensitive wide-field neurons can be brought into agreement with the concept of a local analysis of visual stimuli. As a solution to this problem, we propose a firing-rate model of the retino-tecto-rotundal pathway which describes how spatiotemporal information can be organized and retained by tectal and rotundal wide-field neurons while processing Fourier-based motion in the absence of periodic receptive-field structures. The model incorporates anatomical and electrophysiological experimental data on tectal and rotundal neurons, and the basic response characteristics of tectal and rotundal neurons to moving stimuli are captured by the model cells. We show that local velocity estimates may be derived from rotundal-cell responses via superposition in a subsequent processing step. Experimentally testable predictions which are both specific to and characteristic of the model are provided. Thus, a conclusive explanation can be given of how the retino-tecto-rotundal pathway enables the animal to detect and localize moving objects or to estimate its self-motion parameters.


Visual motion · Retino-tecto-rotundal pathway · Optic tectum · Nucleus rotundus · Optic flow

1 Introduction

Visual motion is one of the most important cues enabling the animal to interact with its environment in a meaningful way. The computation of local velocity is the basis for such important tasks as the detection and location of independently moving objects and the estimation of self-motion parameters. Not surprisingly, neurons sensitive to visual motion have been found in many brain areas (Nakayama 1985; Dellen and Wessel 2008). However, computational models of motion processing have focused mainly on neurons in the thalamocortical pathway (Adelson and Bergen 1985; Heeger 1988; Wörgötter and Koch 1991; Hennig et al. 2002) and are specific to the small and periodic receptive fields of V1 neurons. The situation is different for motion-sensitive neurons in the retino-tecto-rotundal system, which constitutes the main visual pathway in non-mammalian vertebrates. Their response properties are generally incompatible with these kinds of models due to their entirely different organization (see Fig. 1).
Fig. 1

(a) Retino-tecto-rotundal pathway. (b) Schematic of network connectivity of the retino-tecto-rotundal pathway. (c) Reconstruction of a tectal neuron (SGC-I) (Luksch et al. 1998). (d) Reconstructed dendritic field of a tectal neuron (Mahani et al. 2006)

Retinal axons belonging to the retino-tecto-rotundal pathway project in a precise retinotopical manner to the optic tectum (TO) (Mpodozis et al. 1996; Karten et al. 1997). The tectofugal projection arises exclusively from cells in the tectal layer 13 or stratum griseum centrale (SGC) and targets the thalamic nucleus rotundus (Rt) without maintaining a retinotopic organization (Fig. 1(a, b)) (Benowitz and Karten 1976; Engelage and Bischof 1993; Mpodozis et al. 1996; Marin et al. 2003). Nevertheless, deficits in pattern-discrimination tasks and the dramatic postlesional threshold variations in acuity measurements point to the existence of a system with high spatial resolution (Hodos and Karten 1966; Hodos 1969; Hodos and Bonbright 1974; Mulvanny 1979; Hodos et al. 1984; Macko and Hodos 1984; Bessete and Hodos 1989; Watanabe 1991; Güntürkün and Hahmann 1999; Laverghetta and Shimizu 1999; Nguyen et al. 2004). Hodos and Karten (1966) conducted behavioral experiments with pigeons which were trained to peck one of two discs on which visual stimuli were projected. They observed that lesions in the nucleus rotundus caused severe deficits in performance in brightness- and pattern-discrimination tasks. Later on, Laverghetta and Shimizu (1999) showed that lesions in the nucleus rotundus impaired the detection of small moving stimuli. Furthermore, lesions in the caudal ectostriatum, which is the telencephalic target of the tectofugal visual pathway, also caused severe to moderate deficits in visual acuity and motion processing tasks (Hodos et al. 1984; Nguyen et al. 2004).

Tectal neurons with somata in tectal layer 13 have large circular receptive fields spanning ≈ 10 − 60 degrees of the visual field (Luksch et al. 1998; Wu et al. 2005; Schmidt and Bischof 2001). A reconstruction of a representative neuron in the optic tectum is shown in Fig. 1(c). The distribution of dendritic endings is sparse, such that the summed receptive fields of the dendritic endings fill less than 1% of the total receptive field (Mahani et al. 2006). The anatomical organization corresponds to a spotty receptive-field fine structure (Troje and Frost 1998; Letelier et al. 2002; Mahani et al. 2006; Schmidt and Bischof 2001) (Fig. 1(d)). Tectal neurons respond vigorously to small moving stimuli, but they are only weakly selective for the orientation or direction of motion of the stimulus (Frost and Nakayama 1983; Sun et al. 2002). In the tecto-rotundal projection, rotundal neurons receive input from tectal neurons distributed throughout the entire tectum (Fig. 1(b)), whereby the precise point-to-point topography of the retino-tectal projection is completely lost (Benowitz and Karten 1976; Ngo et al. 1994; Karten et al. 1997; Deng and Rogers 1998; Hellmann and Güntürkün 2001; Marin et al. 2003). The tecto-rotundal projection is currently interpreted as implementing a transformation from a retinotopically-organized map into a functionally-organized map (Hellmann and Güntürkün 2001). The several anatomical subdivisions of the Rt correlate with neural populations that respond specifically to different visual modalities, such as two-dimensional motion and in-depth motion (Revzin 1970; Wang and Frost 1990; Wang et al. 1993).

From a theoretical point of view, the following questions arise: (i) How is spatial information organized in the retino-tecto-rotundal pathway, in view of the sparse but extensive receptive and dendritic fields of the neurons? (ii) How is sensitivity to direction of motion generated in the rotundus largely in the absence of periodically arranged subunits that account for motion sensitivity in neural models of other brain areas such as V1 or MT? (iii) How can local velocity estimates be retrieved from motion-sensitive neurons, i.e. rotundal neurons, that have receptive fields spanning up to 120 deg of visual angle?

The paper is structured as follows. In Section 2, we propose a model of the retino-tecto-rotundal pathway and investigate theoretically the spatial organization of this pathway. We also propose a neural network for the extraction of local-velocity fields from the rotundal neural population. In Section 3, we establish by means of computer simulations that the proposed model accounts for motion-sensitive responses of neurons in the optic tectum and the nucleus rotundus. We further provide experimentally testable predictions and demonstrate that local-velocity fields can be computed from the responses of rotundal model neurons. Finally, in Section 4, the results of the model are discussed and directions for future research are indicated.

2 Model

In this section, the retino-tecto-rotundal pathway model is defined and its properties are analyzed theoretically. The basic organization is described in Section 2.1, and the motion-sensitive mechanisms of rotundal model neurons are described in Section 2.2. In Section 2.3, we present a neural model for extracting local-velocity fields from rotundal neurons, serving as a proof of concept. In Section 2.4, the connectivity constraints of the model at the tecto-rotundal projection are explored theoretically. In Sections 2.5–2.7, a preprocessing filter is described, model parameters used in computer simulations are characterized, and an error measure for performance evaluation is defined, respectively.

2.1 Basic organization of the retino-tecto-rotundal pathway

We propose a firing-rate model of the retino-tecto-rotundal pathway. The tectal and rotundal neurons are modelled as summing units that integrate the responses of input neurons, followed by a rectification of the signal. The receptive/dendritic field properties of tectal and rotundal neurons are constructed from anatomical and electrophysiological data, and as such provide parameters to the model. Sensitivity to stimulus velocity is introduced into the model by including temporal filters at the stage of the tecto-rotundal projection in Section 2.2.

Let us model the response of a tectal neuron i to a 2D stimulus \(I(\mathbf{x},t)\) through a continuous, time-dependent firing-rate function
$$ r_{tc}^i(t) = \left[\int_{A} R_{tc}^i(\mathbf{x})I(\mathbf{x},t)\, d\mathbf{x}\right]_+ , \tag{1} $$
where A is the area of the visual field, \(R_{tc}^i(\mathbf{x})\) is the receptive field, \(\mathbf{x} = (x,y)\) is the position vector, and t is the temporal dimension of the visual input. The visual input, integrated over the receptive field, is rectified (rectification is symbolized by \([\,\cdot\,]_+\)). According to the rectification model, \([a]_+ = a\) if \(a > \tau\), and zero otherwise, where τ is a threshold parameter (Granit et al. 1963). The functional form of the model neurons is chosen according to standard firing-rate models (Dayan and Abbott 2005). A schematic of the sparse and random connectivity of a tectal neuron is presented in Fig. 2 (colored in red). The receptive fields of the model tectal neurons are assumed to have a spotty and random fine structure in accordance with experimental data. The parameters of the model are specified in Section 2.6 and Fig. 4.
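A minimal sketch of Eq. (1) may help fix ideas. It assumes a discretized visual field, so that the integral becomes a sum over pixels; the function names `rectify` and `tectal_rate` and the toy numbers are illustrative assumptions and are not taken from the model specification.

```python
def rectify(a, tau=0.0):
    """Threshold rectification [a]_+ : pass a if it exceeds tau, else 0."""
    return a if a > tau else 0.0


def tectal_rate(receptive_field, stimulus, tau=0.0):
    """Discrete version of Eq. (1): weighted sum over pixels, then rectification.

    receptive_field, stimulus: equally long lists (flattened pixel grids).
    """
    drive = sum(w * s for w, s in zip(receptive_field, stimulus))
    return rectify(drive, tau)


# Toy example: a sparse receptive field with three non-zero weights.
rf = [0.0, 0.4, 0.0, -0.2, 0.0, 0.3]
frame = [1.0, 1.0, 0.0, 1.0, 0.0, 0.0]
rate = tectal_rate(rf, frame)             # drive = 0.4 - 0.2 = 0.2
silent = tectal_rate(rf, frame, tau=0.5)  # drive stays below the threshold
```

With a zero threshold the cell simply passes positive drive; raising `tau` silences weakly driven cells.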
Fig. 2

Schematic of a rotundal model neuron. Tectal cells (TC) in the deep layers of the optic tectum integrate the responses of retinotopically organized retinal ganglion cells (RGC) in a sparse and random fashion. A representative tectal cell is depicted in red. The spatial integration is followed by a rectification. At the nucleus rotundus, the responses of subpopulations of tectal cells are summed via intermediate rotundal units (IRU), which could be subsets of synapses, or dendritic branches. A representative intermediate rotundal unit is depicted in green. The rotundal cell response is modelled by summing over the responses of the intermediate rotundal units, followed by a rectification. Sensitivity to stimulus velocity is introduced into the model by temporally filtering the subpopulation responses

We compute the Fourier transform of the receptive field \(R_{tc}^i(\mathbf{x})\) of each model tectal cell i. Multiplying the Fourier transforms of the tectal-cell receptive fields of \(n_{tc}\) tectal neurons by their respective firing-rate functions and summing, we obtain a representation (or map) of tectal responses in Fourier space
$$ M_{tc}(\mathbf{k},t) = \sum\limits_{i=1}^{n_{tc}} R_{tc}^i(\mathbf{k})r_{tc}^i(t) , \tag{2} $$
where \(R_{tc}^i(\mathbf{k})=F[R_{tc}^i(\mathbf{x})]\) is the spatial Fourier transformation of \(R_{tc}^i(\mathbf{x})\) and \(\mathbf{k} = (k_x,k_y)\) is a wave vector with spatial frequencies \(k_x\) and \(k_y\). It is important to note that the map \(M_{tc}(\mathbf{k},t)\) is only implicitly defined through the population of tectal neurons. This map serves as a mental construct in tracking the functional processing path of the neural system being modeled. In Section 3.1, we will employ computer simulations to show that for large numbers of tectal neurons, the approximation
$$ M_{tc}(\mathbf{k},t) \approx F\left[I(\mathbf{x},t)\right] \tag{3} $$
is applicable (up to a scaling factor).
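The approximation in Eq. (3) can be illustrated with a small numerical sketch. It is a toy stand-in rather than the simulation of Section 3.1: it assumes a 1D periodic field of 32 pixels, a zero rectification threshold, a single static frame, and sparse random receptive fields with weights drawn uniformly from [−0.5, 0.5]. All names and sizes (`N`, `N_TC`, `P_POINT`, `similarity`) are hypothetical.

```python
import cmath
import random

random.seed(0)          # fixed seed so the sketch is reproducible
N = 32                  # pixels of a 1D periodic "visual field"
N_TC = 1000             # number of model tectal cells
P_POINT = 0.1           # probability of a dendritic ending per pixel


def dft(signal):
    """Naive discrete Fourier transform of a real or complex list."""
    n = len(signal)
    return [sum(s * cmath.exp(-2j * cmath.pi * k * x / n)
                for x, s in enumerate(signal)) for k in range(n)]


def random_receptive_field():
    """Sparse receptive field: random positions, weights in [-0.5, 0.5]."""
    return [random.uniform(-0.5, 0.5) if random.random() < P_POINT else 0.0
            for _ in range(N)]


stimulus = [random.uniform(-1.0, 1.0) for _ in range(N)]
target = dft(stimulus)                       # F[I]

tectal_map = [0j] * N                        # M_tc(k), Eq. (2)
for _ in range(N_TC):
    rf = random_receptive_field()
    rate = max(0.0, sum(w * s for w, s in zip(rf, stimulus)))  # Eq. (1)
    if rate > 0.0:
        rf_k = dft(rf)
        tectal_map = [m + rate * r for m, r in zip(tectal_map, rf_k)]

# Normalized overlap between M_tc and F[I]; 1.0 means equal up to scaling.
num = abs(sum(m * t.conjugate() for m, t in zip(tectal_map, target)))
den = (sum(abs(m) ** 2 for m in tectal_map) ** 0.5
       * sum(abs(t) ** 2 for t in target) ** 0.5)
similarity = num / den
```

The overlap measure ignores the overall scale, in line with the "up to a scaling factor" qualifier. In this toy setting the residual mismatch shrinks as `N_TC` grows relative to `N`.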
Each spatial-frequency component of \(M_{tc}(\mathbf{k},t)\) corresponds to a subpopulation of tectal neurons, while each tectal neuron can be a member of more than one subpopulation. We assume that rotundal neurons receive input from these subpopulations via an intermediate rotundal unit (depicted in green in Fig. 2). These mediating units are merely constructs to schematize the spatial-frequency processing of the rotundal neurons. We model the response of a rotundal neuron j by randomly sampling the responses of tectal subpopulations to obtain
$$ r_{rc}^j(t) = \left[\int_{\tilde{A}} R_{rc}^j(\mathbf{k})M_{tc}(\mathbf{k},t)\, d\mathbf{k}\right]_+ , \tag{4} $$
where \(R_{rc}^j(\mathbf{k})\) is the spatial Fourier transform of a function \(R_{rc}^j(\mathbf{x})\) in real space, assuring that \(r_{rc}^j(t)\) is real valued. Inserting Eq. (2) in Eq. (4) yields
$$ r_{rc}^j(t) = \left[\int_{\tilde{A}} R_{rc}^j(\mathbf{k}) \left(\sum\limits_{i=1}^{n_{tc}} R_{tc}^i(\mathbf{k})r_{tc}^i(t) \right)d\mathbf{k}\right]_+ \tag{5} $$
$$ \quad = \left[\sum\limits_{i=1}^{n_{tc}}\left(\int_{\tilde{A}} R_{rc}^j(\mathbf{k})R_{tc}^i(\mathbf{k})\,d\mathbf{k}\right)r_{tc}^i(t)\right]_+ \tag{6} $$
$$ \quad = \left[\sum\limits_{i=1}^{n_{tc}} w_{ij}r_{tc}^i(t)\right]_+ , \tag{7} $$
where
$$ w_{ij} = \int_{\tilde{A}} R_{rc}^j(\mathbf{k})R_{tc}^i(\mathbf{k})\,d\mathbf{k} \tag{8} $$
is the connection strength between tectal neuron i and rotundal neuron j. Hence, according to our model, the connectivity pattern at the tecto-rotundal projection is determined by the receptive-field structure of tectal and rotundal neurons via Eq. (8). Consequently, within our model, function (expressed in neuronal response properties) is directly related to network connectivity. This is a characteristic feature of the model which may provide the opportunity in the future to test the underlying assumptions of the model directly.
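The step from Eq. (4) to Eqs. (7) and (8) is an exchange of summation and integration, which can be checked numerically. The sketch below assumes a 1D periodic field and a discrete Fourier transform in place of the continuous one; the receptive fields, firing rates, and all names (`sparse_field`, `drive_via_map`, `drive_via_weights`) are hypothetical illustrations.

```python
import cmath
import random

random.seed(1)
N = 16          # pixels of a 1D periodic field
N_TC = 5        # a handful of tectal cells is enough to check the identity


def dft(signal):
    """Naive discrete Fourier transform of a real list."""
    n = len(signal)
    return [sum(s * cmath.exp(-2j * cmath.pi * k * x / n)
                for x, s in enumerate(signal)) for k in range(n)]


def sparse_field():
    """Random sparse real-space receptive field."""
    return [random.uniform(-0.5, 0.5) if random.random() < 0.3 else 0.0
            for _ in range(N)]


tectal_fields_k = [dft(sparse_field()) for _ in range(N_TC)]   # R_tc^i(k)
rotundal_field_k = dft(sparse_field())                          # R_rc^j(k)
tectal_rates = [random.random() for _ in range(N_TC)]           # r_tc^i

# Route 1, Eq. (4): integrate R_rc^j(k) against the Fourier map M_tc(k).
tectal_map = [sum(rate * field[k]
                  for rate, field in zip(tectal_rates, tectal_fields_k))
              for k in range(N)]
drive_via_map = sum(r * m for r, m in zip(rotundal_field_k, tectal_map))

# Route 2, Eqs. (7)-(8): precompute one weight per tectal cell.
weights = [sum(r * f for r, f in zip(rotundal_field_k, field))
           for field in tectal_fields_k]
drive_via_weights = sum(w * rate for w, rate in zip(weights, tectal_rates))

# Both receptive fields are real in real space, so the drive is real valued.
rate_map = max(0.0, drive_via_map.real)
rate_weights = max(0.0, drive_via_weights.real)
```

Both routes give the same drive, and the imaginary parts vanish up to rounding because each factor is the transform of a real-valued function.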

The choice of the functional form of the projection can be motivated as follows. First, reconstruction of the stimulus at the tecto-rotundal projection ensures that at each layer of the pathway the stimulus can be encoded using the same number of neurons, as shown in Section 2.4, instead of requiring an increasingly growing number of neurons along the pathway. Second, spatial frequencies are “exposed” at the projection, allowing spatiotemporal filtering to be employed, e.g. to obtain velocity sensitivity, without requiring periodic receptive fields in accordance with experimental observation.

We reconstruct the visual input through linear superposition of the responses of \(n_{rc}\) rotundal neurons, giving
$$ M_{rc}(\mathbf{x},t) = \sum\limits_{j=1}^{n_{rc}} R_{rc}^j(\mathbf{x}) r_{rc}^j(t) , \tag{9} $$
where \(R_{rc}^j(\mathbf{x})\) is the inverse Fourier transform of \(R_{rc}^j(\mathbf{k})\). In Section 3.1, we show by means of computer simulations that for large numbers of tectal and rotundal cells
$$ M_{rc}(\mathbf{x},t) \approx I(\mathbf{x},t) \quad \text{(up to a scaling factor)}. \tag{10} $$

2.2 Motion processing with tectal and rotundal neurons

So far, we have described how spatial visual data is organized in our model of the retino-tecto-rotundal pathway. We have defined tectal subpopulations representing global Fourier components of the visual input. However, tectal and rotundal neurons also show selectivity for motion attributes. For example, tectal neurons have been shown to be selective for moving stimuli, while they are only weakly selective for direction of motion (Troje and Frost 1998). This property of tectal neurons might have its origin in the synaptic properties of tectal neurons that promote suppression for static stimuli (Luksch et al. 2004; Khanbabaie et al. 2007), and/or in retinal preprocessing that enhances stimulus contrast. Hence we preprocess the visual input with a high-pass filter (see Section 2.5) to account for spatiotemporal contrast enhancement effects, without going into more detail here.

In the retino-tecto-rotundal pathway, pronounced selectivity for direction of motion is observed for rotundal neurons. In previous work, it was shown that motion processing can be performed in global Fourier space using spatial-frequency-dependent temporal filters (Dellen et al. 2007). We assume that directional temporal filtering takes place at the interface between tectal and rotundal neurons. In the model, intermediate rotundal units temporally filter the input from the associated tectal subpopulation. We can write the response of a rotundal neuron j selective for a velocity \(\mathbf{v}\) as
$$ r_{rc,\mathbf{v}}^j(t) = \left[\int_{\tilde{A}} R_{rc}^j(\mathbf{k})\tilde{M}_{tc,\mathbf{v}}(\mathbf{k},t)\, d\mathbf{k}\right]_+ , \tag{11} $$
with
$$ \tilde{M}_{tc,\mathbf{v}}(\mathbf{k},t) = \int T_{\mathbf{v},\mathbf{k}}(t-t') M_{tc}(\mathbf{k},t')\, dt' , \tag{12} $$
where \(T_{\mathbf{v},\mathbf{k}}(t)\) is a temporal filter selective for a temporal frequency \(\omega = \mathbf{k}\cdot\mathbf{v}\), which implements the motion constraint equation (Adelson and Bergen 1985; Barron et al. 1994). This equation states that all the nonzero power associated with a translating 2D pattern lies on a plane through the origin in Fourier space, whose orientation is determined by the pattern velocity vector. The pattern velocity itself can be derived from the nonzero Fourier components by finding the velocity for which the constraint lines of the Fourier components intersect.
Inserting Eq. (2) in Eq. (11) yields
$$ r_{rc,\mathbf{v}}^j(t) = \left[\int_{\tilde{A}} R_{rc}^j(\mathbf{k})\left(\int T_{\mathbf{v},\mathbf{k}}(t-t') \left(\sum\limits_{i=1}^{n_{tc}} R_{tc}^i(\mathbf{k})r_{tc}^i(t')\right) dt' \right) d\mathbf{k}\right]_+ \tag{13} $$
$$ \quad = \left[\sum\limits_{i=1}^{n_{tc}}\int \left(\int_{\tilde{A}} R_{rc}^j(\mathbf{k}) T_{\mathbf{v},\mathbf{k}}(t-t') R_{tc}^i(\mathbf{k})\,d\mathbf{k}\right)r_{tc}^i(t')\, dt' \right]_+ \tag{14} $$
$$ \quad = \left[\sum\limits_{i=1}^{n_{tc}} \int w_{ij}(t-t')r_{tc}^i(t')\, dt'\right]_+ , \tag{15} $$
where
$$ w_{ij}(t) = \int_{\tilde{A}} R_{rc}^j(\mathbf{k}) T_{\mathbf{v},\mathbf{k}}(t) R_{tc}^i(\mathbf{k})\,d\mathbf{k} \tag{16} $$
is a time-dependent effective (functional) connectivity between tectal neuron i and rotundal neuron j.
To date, experimental evidence is insufficient to determine the precise properties of the temporal filtering. In this paper, we use the function
$$ T_{\mathbf{v},\mathbf{k}}(t) = \int \exp\left[i\omega' t\right] \exp\left[-\frac{(\omega' -\mathbf{k}\cdot \mathbf{v})^2}{\xi|\mathbf{k}|^2}\right] d\omega' , \tag{17} $$
if t ≥ 0 and zero otherwise. The function \(\exp[-(\omega -\mathbf{k}\cdot \mathbf{v})^2/(\xi|\mathbf{k}|^2)]\) is a Gaussian in ω centered on \(\mathbf{k}\cdot \mathbf{v}\), whose width is set by ξ. This functional form allows us to adjust how strictly the motion constraint equation is enforced (by varying ξ). The parameter ξ has dimensions deg²/s². The temporal filter contains a spatial-frequency-dependent weighting term, the factor \(|\mathbf{k}|^2\) in the exponent, which ensures that the same number of cycles is sampled for each spatial frequency.
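To illustrate the role of the spatial-frequency-dependent weighting, the following sketch evaluates only the Gaussian factor of Eq. (17) in the frequency domain, for a 1D wave number. The causal time-domain filter itself is not constructed here; the function names and the chosen numbers are illustrative assumptions.

```python
import math


def filter_gain(omega, k, v, xi):
    """Gaussian factor of Eq. (17) for a 1D wave number k.

    omega: temporal frequency, k: spatial frequency (k != 0),
    v: preferred velocity, xi: tolerance in deg^2/s^2.
    """
    return math.exp(-((omega - k * v) ** 2) / (xi * k ** 2))


def velocity_tuning(k, true_v, candidates, xi):
    """Gain of each candidate-velocity filter for a component at omega = k * true_v."""
    omega = k * true_v
    return {v: filter_gain(omega, k, v, xi) for v in candidates}


candidates = [0.0, 1.0, 2.0, 3.0]
tuning_low_k = velocity_tuning(k=0.5, true_v=2.0, candidates=candidates, xi=1.0)
tuning_high_k = velocity_tuning(k=4.0, true_v=2.0, candidates=candidates, xi=1.0)
best_v = max(tuning_low_k, key=tuning_low_k.get)
```

Because the mismatch \((\omega - kv)^2\) scales with \(k^2\) for a component moving at a given velocity, dividing by \(\xi k^2\) yields the same velocity tuning curve at every spatial frequency, here \(\exp[-(v_0 - v)^2/\xi]\) for a true velocity \(v_0\).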
According to our model, a population of rotundal neurons selective for velocity v segments the part of the image moving with the corresponding velocity. Reconstruction of the segmented entity in real space can be achieved by summing over all rotundal responses selective for the velocity v, giving
$$ M_{rc,\mathbf{v}}(\mathbf{x},t) = \sum\limits_{j} R_{rc}^j(\mathbf{x})r_{rc,\mathbf{v}}^j(t). \tag{18} $$

2.3 Extracting local velocity from rotundal responses

In Section 2.2, we proposed that moving entities can be segmented by means of velocity-selective rotundal model neurons based on Eq. (18). In Section 3.4, we will support this proposition with computer simulations. Of course, real image sequences exhibit complex patterns of motion involving accelerated motions, including rotation. It is thus desirable to compute a local-velocity field (or optic-flow field) of the visual input, in which a velocity estimate is assigned to each point in the image sequence. In this subsection, we formulate a neural model that permits local-velocity estimates to be derived from populations of velocity-selective rotundal neurons, following up with an algorithmic implementation of the model.

The computation of local velocity requires a joint representation of velocity and position. However, rotundal neurons are selective for velocity but not for position. In Section 3.1, it is demonstrated that position can be retrieved from the rotundal responses by integrating the responses of certain subpopulations (see also Eqs. (9–10) and Eq. (18)). We assume that this operation is implemented by neurons positioned at a higher level in the visual pathway, which we call C1 neurons. These neurons might be located in the caudal ectostriatum (Gu et al. 2002; Nguyen et al. 2004).

The response of a C1 neuron o jointly selective for velocity \(\mathbf{v}\) and position \(\mathbf{x}\) is defined by
$$ r^o_{c1,\mathbf{x},\mathbf{v}}(t) = \left[\sum\limits_{j} R_{rc}^j(\mathbf{x})r_{rc,\mathbf{v}}^j(t)\right]_{+/-} \tag{19} $$
$$ \quad = \left[M_{rc,\mathbf{v}}(\mathbf{x},t)\right]_{+/-} , \tag{20} $$
where the idea expressed in Eq. (18) has been adapted to C1 neurons. Here, we use a full rectification with \([a]_{+/-} = [a]_+\) if \(a \geq 0\) and \([-a]_+\) otherwise. Constructive interference, measured by taking the absolute value, i.e. full rectification, of the superposed weighted responses of velocity-selective rotundal neurons, leads to a joint selectivity for position and velocity.
We further implement a spatiotemporal smoothing by combining C1 responses at a secondary stage C2. Thus we obtain
$$ r^o_{c2,\mathbf{x},\mathbf{v}}(t) = \int R_{c2}(\mathbf{x}-\mathbf{x}')\, r^o_{c1,\mathbf{x}',\mathbf{v}}(t)\, d\mathbf{x}' , \tag{21} $$
with
$$ R_{c2}(\mathbf{x}') = \exp\left(-\mathbf{x}'^2/\alpha^2\right) , \tag{22} $$
where α is a smoothing parameter with dimension deg. This computational scheme is illustrated in Fig. 3. The smoothing operation at the secondary stage could alternatively be implemented by horizontal interactions between C1 neurons instead of a two-stage neural network.
Fig. 3

Extracting local velocity from rotundal responses. C1 neurons integrate the responses of rotundal subpopulations according to Eq. (20), generating joint selectivity for position and velocity (depicted in red). C2 neurons average the responses of C1 neurons of similar selectivity

We can now assign a velocity \(\mathbf{v}_e(\mathbf{x},t)\) to each point of the input sequence \(I(\mathbf{x},t)\) by finding the C2 neuron among the C2 subpopulation selective for \(\mathbf{x}\) that shows the strongest response, thereby providing the local velocity estimate
$$ \mathbf{v}_e(\mathbf{x},t) = \text{arg}\left\{\max\limits_{\mathbf{v}}\left[r_{c2,\mathbf{x},\mathbf{v}}\right]\right\} . \tag{23} $$
Note that this step is only introduced to obtain a representation of local velocity, which allows evaluating the performance of the motion-processing system both qualitatively and quantitatively. In the animal brain, similar operations may be implemented in order to generate motor output.
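The C2 smoothing and the winner-take-all readout of Eqs. (21) to (23) can be sketched on a 1D grid as follows. The C1 responses are made-up numbers standing in for the output of Eq. (20), and the names `smooth` and `local_velocity` are illustrative assumptions.

```python
import math


def smooth(values, alpha):
    """Eqs. (21)-(22) on a 1D grid: convolve with exp(-d^2 / alpha^2)."""
    n = len(values)
    return [sum(math.exp(-((x - y) ** 2) / alpha ** 2) * values[y]
                for y in range(n))
            for x in range(n)]


def local_velocity(c1_responses, alpha):
    """Eq. (23): per position, pick the velocity whose C2 response is largest.

    c1_responses: dict mapping candidate velocity -> list of C1 responses over x.
    """
    c2 = {v: smooth(resp, alpha) for v, resp in c1_responses.items()}
    n = len(next(iter(c2.values())))
    return [max(c2, key=lambda v: c2[v][x]) for x in range(n)]


# Hypothetical C1 responses: the left half of the field moves at +1 deg/s,
# the right half at -1 deg/s.
c1 = {
    +1.0: [1.0, 0.9, 1.0, 0.8, 0.0, 0.1, 0.0, 0.0],
    -1.0: [0.0, 0.1, 0.0, 0.0, 0.9, 1.0, 0.8, 1.0],
}
flow = local_velocity(c1, alpha=1.0)
```

The smoothing suppresses isolated weak responses (the 0.1 entries) so that each half of the field is assigned a single velocity.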
In the limit of large tectal and rotundal populations, we make the substitutions
$$ M_{tc}(\mathbf{k},t) = F\left[I(\mathbf{x},t)\right] \tag{24} $$
$$ M_{rc,\mathbf{v}}(\mathbf{x},t) = \tilde{F}\left[T_{\mathbf{v},\mathbf{k}}(t)\ast_t F[I(\mathbf{x},t)]\right] , \tag{25} $$
in Eqs. (11) and (20), respectively. The symbol \(\ast_t\) denotes a temporal convolution and \(\tilde{F}\) is the inverse spatial Fourier transformation. Keeping in mind that the full rectification performed in Eq. (20) is mathematically equivalent to taking the absolute value of \(M_{rc,\mathbf{v}}(\mathbf{x},t)\), Eq. (23) can then be replaced by
$$ \mathbf{v}_e(\mathbf{x},t) = \text{arg}\left\{\max\limits_{\mathbf{v}}\left[\left|\tilde{F}\left[T_{\mathbf{v},\mathbf{k}}(t)\ast_t F\left[I(\mathbf{x},t)\right]\right]\right|\ast_{x} \exp\left(-\mathbf{x}^2/\alpha^2\right)\right]\right\} , \tag{26} $$
which constitutes an algorithm for the computation of optic flow utilizing global spatiotemporal filters. The symbol \(\ast_x\) denotes a spatial convolution. It has been demonstrated recently that algorithms utilizing global Fourier transformations for velocity estimation are not impaired by uncertainties arising in algorithms that utilize a local measurement window (Dellen and Wörgötter 2008). Furthermore, a confidence measure can be defined as
$$ c(\mathbf{x},t) = \left|\tilde{F}\left[T_{\mathbf{v}_e,\mathbf{k}}(t)\ast_t F[I(\mathbf{x},t)]\right]\right|\ast_{x} \exp\left(-\mathbf{x}^2/\alpha^2\right) , \tag{27} $$
upon which a threshold \(\tau_r\) can be applied to select only the more reliable velocity estimates, which is a common strategy in computer vision (Barron et al. 1994).
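A compact sketch of the algorithm in Eq. (26) is given below for one spatial dimension plus time. It makes several simplifying assumptions that are not part of the model description: the stimulus is periodic in space and time, the causal filter of Eq. (17) is replaced by its Gaussian weighting applied directly in the frequency domain, the mean is removed as a crude stand-in for the high-pass preprocessing, and velocities are integers in pixels per frame. With the forward transform convention exp(−i…) used here, a pattern translating at v has its energy at temporal frequency −kv, so the constraint is written accordingly. All names and parameter values are hypothetical.

```python
import cmath
import math

N = 16                         # pixels and frames (periodic in both)
TRUE_V = 1                     # pixels per frame
CANDIDATES = [-2, -1, 0, 1, 2]
XI = 0.05                      # tolerance of the motion constraint
ALPHA = 2.0                    # spatial smoothing width in pixels


def pattern(x):
    return math.sin(2 * math.pi * x / N) + 0.5 * math.cos(2 * math.pi * 3 * x / N)


# Stimulus I(x, t): the pattern translating rigidly at TRUE_V.
frames = [[pattern(x - TRUE_V * t) for x in range(N)] for t in range(N)]


def signed(f):
    """Map a DFT index 0..N-1 to a signed frequency in [-N/2, N/2)."""
    return f - N if f >= N // 2 else f


# Forward space-time DFT, F[I](k, w).
spectrum = {}
for k in range(N):
    for w in range(N):
        spectrum[k, w] = sum(
            frames[t][x] * cmath.exp(-2j * math.pi * (k * x + w * t) / N)
            for t in range(N) for x in range(N))


def gain(k, w, v):
    """Gaussian weighting around the motion-constraint plane w = -k v."""
    ks, ws = signed(k), signed(w)
    if ks == 0:
        return 0.0                          # drop the spatial mean
    mismatch = signed((ws + ks * v) % N)    # wrapped distance to the plane
    return math.exp(-mismatch ** 2 / (XI * ks ** 2))


def channel(v, t=0):
    """|inverse DFT| of the filtered spectrum at frame t, then smoothing."""
    c1 = []
    for x in range(N):
        value = sum(gain(k, w, v) * spectrum[k, w]
                    * cmath.exp(2j * math.pi * (k * x + w * t) / N)
                    for k in range(N) for w in range(N)) / N ** 2
        c1.append(abs(value))
    # Circular Gaussian smoothing over position.
    return [sum(math.exp(-min(abs(x - y), N - abs(x - y)) ** 2 / ALPHA ** 2) * c1[y]
                for y in range(N)) for x in range(N)]


responses = {v: channel(v) for v in CANDIDATES}
flow = [max(CANDIDATES, key=lambda v: responses[v][x]) for x in range(N)]
```

For a single rigidly translating pattern, only the channel tuned to the true velocity passes appreciable energy, so every position is assigned `TRUE_V`. The value `responses[v][x]` at the winning velocity plays the role of the confidence measure in Eq. (27).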

2.4 Connectivity of the tecto-rotundal projection

The weight (connection strength) of a connection between a tectal and rotundal neuron is defined via Eq. (8). If the weight is zero or sufficiently close to zero, the connection can be considered as non-existing. Hence, according to our model, the number of connections is influenced by the choice of the model parameters (see also Section 2.6).

The total number of (non-zero) connections is also bounded by the upper limit \(n_{tc}\) of the sum of Eq. (7). To assure proper transmission of the spatial information of the stimulus, \(n_{tc}\) is assumed to be large, which implies that a rotundal neuron makes connections with many tectal cells. Experiments (in the cerebral cortex) indicate that a neuron receives input from about \(10^4\) neurons (Koch 1999; Pakkenberg et al. 2003). The large dendritic fields of rotundal neurons may suggest an even higher number for the tecto-rotundal projection. However, we will show in the following that the number of required connections can be decreased by creating rotundal subpopulations which receive input from an exclusive subset of tectal neurons, distributed throughout the entire tectum.

Let us assume that a rotundal neuron j receives input only from a subpopulation \(P_s\) of \(n_c\) tectal neurons. Hence, we write
$$ r_{rc}^{j,s} = \left[\int R_{rc}^{j,s}(\mathbf{k})A^s(\mathbf{k})\,d\mathbf{k}\right]_+ , \tag{28} $$
with
$$ A^s(\mathbf{k}) = \sum\limits_{l\in P_s} R_{tc}^l(\mathbf{k})\, r_{tc}^l . \tag{29} $$
For a large number \(n_{tc}\) of tectal neurons, the following equations hold:
$$ F[I(\mathbf{x},t)] = \sum\limits_{i=1}^{n_{tc}} R_{tc}^i(\mathbf{k})\, r_{tc}^i \tag{30} $$
$$ \quad = \sum\limits_{s=1}^{n_q} \sum\limits_{l\in P_s} R_{tc}^l(\mathbf{k})\, r_{tc}^l \tag{31} $$
$$ \quad = \sum\limits_{s=1}^{n_q} A^s(\mathbf{k}) , \tag{32} $$
if the subpopulations \(\{P_1,P_2,...,P_{n_q}\}\) are mutually exclusive and \(n_c n_q = n_{tc}\) is fulfilled. Here, \(n_q\) denotes the number of subpopulations. It can be shown by computer simulations that for large numbers \(n_a\) of rotundal neurons, one can write (in some approximation)
$$ A^s(\mathbf{k}) = \sum\limits_{j=1}^{n_a} R_{rc}^{j,s}(\mathbf{k})\, r_{rc}^{j,s} . \tag{33} $$
Inserting Eq. (33) in Eq. (32), we obtain
$$ F[I(\mathbf{x},t)] = \sum\limits_{s=1}^{n_q} \sum\limits_{j\in P'_s} R_{rc}^{j,s}(\mathbf{k})\, r_{rc}^{j,s} , \tag{34} $$
where \(P'_s\) is a subpopulation of \(n_a\) rotundal neurons receiving input exclusively from tectal subpopulation \(P_s\). Hence, we have a total number of \(n_{rc} = n_a n_q\) rotundal neurons. According to this derivation, the number of connections can be decreased without impairing function if a proportional number of rotundal subpopulations is created, each subpopulation receiving input from the same group of tectal neurons. Assuming a fixed total number of tectal neurons \(n_{tc}\) and constant \(n_a\), system performance is constant if
$$ n_{c}n_{rc} = n_{a}n_{tc} = \text{const}, \tag{35} $$
i.e. the number of required tectal neurons within each subpopulation can be decreased by increasing the total number of rotundal neurons (by creating more subpopulations). Using double-tracer injections in the nucleus rotundus, Marin et al. (2003) discovered that the tecto-rotundal projection is interdigitating, i.e. intermingled sets of tectal neurons terminate in separate regions within one subdivision of the nucleus rotundus. According to our model, this architecture might have its origin in connectivity constraints which have to be compensated by creating rotundal subpopulations (within a subdivision). These results further demonstrate that incomplete implicit stimulus reconstruction at the tecto-rotundal projection has to be compensated at the rotundal level by increasing the total number of neurons if the entire stimulus is to be transmitted, favoring an architecture as proposed by Eq. (8) for efficiency arguments.
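The trade-off expressed by Eq. (35) follows directly from \(n_c n_q = n_{tc}\) and \(n_{rc} = n_a n_q\), and can be made concrete with a few lines of arithmetic. The population sizes below are purely illustrative and are not estimates for any animal; the function name `projection_sizes` is hypothetical.

```python
def projection_sizes(n_tc, n_q, n_a):
    """Population sizes when the tectum is split into n_q exclusive subpopulations.

    n_tc: total tectal neurons, n_q: number of subpopulations,
    n_a: rotundal neurons per subpopulation.
    Returns (n_c, n_rc): inputs per rotundal neuron and total rotundal neurons.
    """
    if n_tc % n_q:
        raise ValueError("n_tc must be divisible by n_q")
    n_c = n_tc // n_q      # from n_c * n_q = n_tc
    n_rc = n_a * n_q       # from n_rc = n_a * n_q
    return n_c, n_rc


# Illustrative numbers only: 10,000 tectal cells, 100 rotundal cells per group.
for n_q in (1, 10, 100):
    n_c, n_rc = projection_sizes(n_tc=10_000, n_q=n_q, n_a=100)
    print(n_q, n_c, n_rc, n_c * n_rc)
```

Each tenfold increase in the number of subpopulations cuts the inputs per rotundal neuron by a factor of ten and raises the rotundal population by the same factor, leaving the product \(n_c n_{rc}\) unchanged.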

2.5 Preprocessing

We preprocess the image sequence with a high-pass filter. This step accounts for change sensitivity at the tectal level. The high-pass (Butterworth) filter is defined in Fourier space as
$$ \Psi_f = 1/\left[1+\tau_f/\left(k_x^2 + k_y^2 + f_s k_t^2\right)\right] , \tag{36} $$
where \(\tau_f\) is a threshold parameter with dimension deg⁻², \(k_x\), \(k_y\), and \(k_t\) are the frequencies of the image sequence, and \(f_s = 1\,\text{s}^2/\text{deg}^2\) converts the temporal frequency term to the units of the spatial terms. This filter enhances the spatiotemporal contrast of the image sequence.
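As a small sketch, the gain of Eq. (36) at a single spatiotemporal frequency can be evaluated as follows. The function name, the default `f_s = 1.0`, and the handling of the zero-frequency point (taken as the limit of the expression) are assumptions made for illustration.

```python
def highpass_gain(kx, ky, kt, tau_f, f_s=1.0):
    """Gain of the high-pass filter in Eq. (36) at one spatiotemporal frequency.

    kx, ky: spatial frequencies (1/deg), kt: temporal frequency (1/s),
    tau_f: threshold (deg^-2), f_s: conversion factor (s^2/deg^2).
    """
    radius_sq = kx ** 2 + ky ** 2 + f_s * kt ** 2
    if radius_sq == 0.0:
        return 0.0          # limit of Eq. (36) at zero frequency
    return 1.0 / (1.0 + tau_f / radius_sq)


gains = [highpass_gain(k, 0.0, 0.0, tau_f=1.0) for k in (0.0, 0.5, 1.0, 2.0, 10.0)]
```

The gain rises monotonically from 0 at zero frequency to 1 at high frequencies and equals 0.5 where the squared frequency radius equals \(\tau_f\), so static, spatially uniform input is removed.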

2.6 Tectal and rotundal receptive-field parameters

The parameters of the model are specified by defining the receptive fields of tectal and rotundal model neurons. The receptive field of a tectal neuron is generated by first creating points in the area A with a uniform probability of 0.1 points/deg². This is done by tiling the potential receptive-field area into small surface elements of 1 deg by 1 deg. With probability 0.1, we locate a point (corresponding to a dendritic ending) in that element. This is repeated for all surface elements. Ideally, this should be done for elements of 1/n deg by 1/n deg and probability 0.1/n with n → ∞. In Fig. 4(a), the histogram of nearest neighbor distances between the created points is shown, representing the distribution of dendritic endings of tectal cells. A visual angle of 1 deg corresponds approximately to 100 μm on the tectal surface (Mahani et al. 2006). The resulting distribution of points (corresponding to dendritic endings) is thus in a realistic range. Having specified the point distribution (or dendritic endings), the corresponding weights to each point are determined by sampling a uniform distribution between −0.5 and 0.5, as depicted in Fig. 4(b). The corresponding spatial pattern is presented in Fig. 4(c). The corresponding energies of the spatial Fourier components are shown in Fig. 4(d).
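The generation procedure described above can be sketched as follows. The sketch assumes a square field of 60 deg by 60 deg for simplicity (the measured receptive fields are roughly circular); the function name `make_receptive_field`, the field size, and the fixed random seed are illustrative choices.

```python
import random


def make_receptive_field(size_deg, density=0.1, rng=random):
    """Return {(x, y): weight} for occupied 1-deg elements of a square field.

    Each 1 deg x 1 deg element receives a point (dendritic ending) with
    probability `density`; its weight is drawn uniformly from [-0.5, 0.5].
    """
    field = {}
    for x in range(size_deg):
        for y in range(size_deg):
            if rng.random() < density:
                field[(x, y)] = rng.uniform(-0.5, 0.5)
    return field


rng = random.Random(0)                    # fixed seed for reproducibility
rf = make_receptive_field(60, rng=rng)
coverage = len(rf) / 60 ** 2              # fraction of occupied elements
```

About one element in ten is occupied, and every other element has zero weight, which gives the spotty, sparse fine structure the model assumes.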
Fig. 4

Receptive-field properties of model neurons. (a) Histogram of nearest-neighbor distances between points defining the receptive field of tectal neurons. (b) Histogram of the corresponding weight distribution. (c) Spatial pattern of points in real space used to generate the real-space receptive field of tectal and rotundal model neurons (only a quarter of the pattern is shown for better display). (d) Power spectrum of the spatial Fourier space of the receptive field shown in (c). (e) Histogram of the weight distribution of tecto-rotundal connections according to Eq. (8). (f) Alternative choice of spatial pattern of tectal and rotundal receptive fields in real space assuming sparseness in Fourier space (only a quarter of the pattern is shown for better display). (g) Power spectrum of the spatial Fourier space of the receptive field shown in (f). (h) Histogram of the weight distribution of tecto-rotundal connections according to Eq. (8). Most weights are zero

The receptive fields of rotundal neurons are defined in spatial Fourier space and are generated by taking the spatial Fourier transform of a point distribution identical to the one used to generate tectal receptive fields (Fig. 4(c)). To our knowledge, no conclusive data are available about the receptive-field structure of rotundal neurons. Using Eq. (8), we compute the distribution of weights representing the connection strength between a tectal and a rotundal neuron. The histogram of the weights is shown in Fig. 4(e). Weights range between −0.5 and 0.5, and the histogram peaks at zero. Hence, for the chosen receptive fields, only a small fraction of connections are strong. This is in accordance with experimental data (Marin et al. 2003).

Our knowledge about the precise receptive-field properties of tectal neurons is limited. Tectal dendritic-field properties only provide an approximation of tectal receptive fields (Troje and Frost 1998; Mahani et al. 2006; Schmidt and Bischof 2001). However, potential hidden structures in tectal and rotundal receptive fields may have a considerable impact on tecto-rotundal connectivity. Experimental data by Schmidt and Bischof (2001) indicates that the sparse and spotty receptive fields of tectal neurons contain substructures, which are however not yet sufficiently quantified to draw further conclusions with respect to our model. Receptive-field parameters have also been measured for neurons in the superior colliculus (Prevost et al. 2007; Mooney et al. 1988), the mammalian homolog of the optic tectum, but mainly in superficial layers. Future measurements of responses of deep tectal neurons to grating stimuli, i.e. spatial frequency, would allow us to further constrain the model. For example, sparseness in the spatial Fourier space (instead of sparseness in real space) of tectal and rotundal receptive fields has a strong impact on the connectivity pattern predicted by the model. In Fig. 4(f, g) the corresponding spatial patterns of the receptive fields are shown in real space and in Fourier space, respectively. Most weights defining the tecto-rotundal projection are zero, resulting in a sharp peak in the histogram (Fig. 4(h)).

It is important to keep in mind that our model of the retino-tecto-rotundal pathway is completely defined through Eqs. (1)–(18). Experimentally measurable quantities, such as the receptive-field structure of tectal and rotundal neurons, are parameters to the model that can be adjusted according to current knowledge. Predictions of the model such as the distribution of weights of the tecto-rotundal connection are necessarily influenced by the parameter values.

2.7 Error measurements

When the true velocities of a sequence of images of size m × n are given, error measures can be computed to quantify the performance of the algorithm. Following Barron et al. (1994), the mean angular error is defined as
$$ \bar{E}=\sum\limits_{\mathbf{x}}E(\mathbf{x})/n m , $$
$$ E(\mathbf{x})=\arccos\left\{\frac{\mathbf{v}(\mathbf{x})\cdot\mathbf{v}_e(\mathbf{x}) + 1}{\left[\left(|\mathbf{v}(\mathbf{x})|^2 + 1\right)\left(|\mathbf{v}_e(\mathbf{x})|^2 + 1\right)\right]^{1/2}}\right\}, $$
where ve is the estimated velocity and v is the true velocity.
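As an illustration, the angular error can be computed as follows. The sketch uses the standard form of the Barron et al. (1994) measure, in which each velocity (vx, vy) is extended to the space-time vector (vx, vy, 1) and the error is the angle between the true and estimated vectors. The function names, the flat list-of-tuples layout of the velocity fields, and the output in degrees are assumptions made for this example.

```python
import math

def angular_error(v_true, v_est):
    """Angle in degrees between (vx, vy, 1) vectors of true and estimated velocity."""
    dot = v_true[0] * v_est[0] + v_true[1] * v_est[1] + 1.0
    norm = math.sqrt((v_true[0] ** 2 + v_true[1] ** 2 + 1.0)
                     * (v_est[0] ** 2 + v_est[1] ** 2 + 1.0))
    # Clamp to [-1, 1] to guard against rounding just outside the arccos domain.
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))

def mean_angular_error(true_field, est_field):
    """Average the local angular error over all pixels of a velocity field."""
    errors = [angular_error(v, ve) for v, ve in zip(true_field, est_field)]
    return sum(errors) / len(errors)

# Hypothetical two-pixel example: one exact estimate, one with reversed direction.
print(mean_angular_error([(1.0, 0.0), (1.0, 0.0)], [(1.0, 0.0), (-1.0, 0.0)]))
```

Appending the constant 1 keeps the measure well defined at zero velocity, where a purely directional error would be undefined.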

3 Results

In this section, response properties of tectal and rotundal model neurons are computed in computer simulations of the model of the retino-tecto-rotundal pathway introduced in Section 2. We choose parameters τf = 0.2 deg² and ξ = 0.6 frame²/deg² for the spatiotemporal filters of the retino-tecto-rotundal pathway (see Eqs. (17) and (36)). The 2D spatial functions, defining the receptive fields of tectal and rotundal neurons, Rtc(x) and Rrc(x), respectively, are specified in Section 2.6. To obtain an intensity distribution with a zero mean for each image sequence, we subtract the mean intensity value from I(x,t). The rectification threshold τ (see Eq. (1)) is set to zero. We further show in Section 3.6 that local velocity fields can be computed from the output of the model of the retino-tecto-rotundal pathway using the computational scheme derived in Section 2.3. For the C2 neurons, we choose a smoothing parameter α = 10 deg (see Eqs. (26) and (22)). The model parameters are not altered unless indicated otherwise.

3.1 Organization of spatial information in the tectal and rotundal cell populations

Computer simulations are used to predict the responses of the model tectal and rotundal cell populations. First we demonstrate that the tectal responses approximate the Fourier transform of an input image of size 95×128 pixels², assuming that each pixel corresponds to 1 deg of visual angle. For computational reasons, the resolution is constrained to a maximum of 1 deg, which is below the resolution of the avian visual system. The receptive field of each model tectal cell is generated according to Section 2.6 assuming sparseness in real space (see Fig. 4(a–c)). For practicality, the receptive-field size was chosen such that the visual field is tiled into equally large parts. This detail has no effect on the results of the computation. For the tectal-cell populations, we computed the representation of the population response in Fourier space (Eq. (2)) and calculated the correlation coefficient of the Fourier transform F[I(x)] of the input image and Mtc(k). In Fig. 5(a), the correlation coefficient is plotted as a function of the number of tectal cells. For large numbers of cells, a correlation coefficient above 0.95 is obtained. The squared correlation coefficient, which measures the fraction of variance reconstructed, is then larger than 0.9, demonstrating that the original image can be largely retrieved from the tectal populations. The input image, shown as the left panel of the inset in Fig. 5(a) and also shown in Fig. 10(b), is a snapshot of the so-called taxi sequence. The reconstructed image from the tectal responses is juxtaposed as the right panel of the inset.
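The reconstruction score used here can be illustrated with a plain Pearson correlation coefficient. The sketch below is restricted to real-valued sequences (for example, flattened pixel intensities); how the correlation is taken over complex Fourier coefficients is not specified in this section, so that case is left out. The function name and the toy data are hypothetical.

```python
import math

def correlation_coefficient(a, b):
    """Pearson correlation coefficient of two equally long real-valued sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

original = [0.1, 0.4, 0.35, 0.8, 0.6]          # toy "image" intensities
reconstructed = [0.12, 0.38, 0.40, 0.75, 0.55]  # toy reconstruction
r = correlation_coefficient(original, reconstructed)
# r squared is the fraction of variance of the original captured by the reconstruction.
print(r, r ** 2)
```

Squaring the coefficient is what links the two thresholds quoted in the text: a correlation coefficient of 0.95 corresponds to roughly 90% of the variance being reconstructed.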
Fig. 5

Stimulus reconstruction from tectal and rotundal population responses. (a) Correlation coefficient of the Fourier space representation of the tectal neuronal populations (Eq. (2)) and Fourier transform of the visual input plotted for increasing numbers of tectal neurons. For large numbers of cells, a correlation coefficient above 0.95 is obtained, corresponding to a squared correlation coefficient larger than 0.9 (see text). The reconstructed image from the tectal population, presented in the inset (right panel), may be compared with the original image (left panel of inset). (b) Correlation coefficient of the real-space representation of the rotundal neuronal populations (Eq. (6)) and the visual input, plotted against the number of rotundal neurons. For large numbers, we achieve correlation coefficients larger than 0.95. (c) Correlation coefficient of the original and the reconstructed image for different levels of noise fn in the connectivity (see text)

The receptive field of rotundal neurons is generated according to Section 2.6 assuming sparseness in real space (see Fig. 4(c–e)) and has a size of 95×128 deg². The Fourier-space representation of a typical rotundal receptive field is shown in Fig. 4(d). We can now reconstruct the visual input from the rotundal responses by computing Mrc(x,t) with Mtc(k) replaced by F[I(x)]. The correlation coefficient of the input image I(x) and Mrc(x) is then calculated and plotted in Fig. 5(b) as a function of the number of rotundal cells. For large numbers of cells, a correlation coefficient above 0.9 is obtained, demonstrating that the original image can be largely retrieved from the rotundal populations.

We investigate how noise in the connectivity between tectal and rotundal neurons affects the quality of reconstruction. According to Eq. (2), each tectal cell contributes to each subpopulation k with a weight Rtc(k). To each of these weights, we add a noise term \(f_n n_r \overline{|R_{tc}(\mathbf{k})|}\) where fn is the noise factor and nr is a random number drawn from a Gaussian distribution with a standard deviation of 1. The correlation coefficient of the reconstructed image and the original is plotted in Fig. 5(c). Noise in the connectivity impairs image-reconstruction performance when the noise terms are in the range of the average absolute connection strength, approximated by \(\overline{|R_{tc}(\mathbf{k})|}\). However, for a noise level of 50% of the average absolute connection strength, i.e. fn = 0.5, performance drops only by about 10%. For a noise level of 400%, i.e. fn = 4, performance decreases by about 40%. These values suggest robustness and graceful degradation of performance with network damage, which is typical for coarse-coding schemes (Hinton et al. 1986).
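The noise perturbation described above can be sketched as follows. The example treats the weights as a flat list of real numbers, which is a simplifying assumption (the weights Rtc(k) live in Fourier space and may be complex in the full model); the function name, the seed, and the sample weights are hypothetical.

```python
import random

def add_connectivity_noise(weights, f_n, seed=0):
    """Add Gaussian noise scaled by f_n times the mean absolute weight.

    Each weight w becomes w + f_n * n_r * mean(|w|), where n_r is drawn
    from a Gaussian with zero mean and unit standard deviation.
    """
    rng = random.Random(seed)  # seeded only for reproducibility (assumption)
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    return [w + f_n * rng.gauss(0.0, 1.0) * mean_abs for w in weights]

weights = [0.1, -0.2, 0.3, -0.05]
noisy = add_connectivity_noise(weights, f_n=0.5)  # 50% noise level
print(noisy)
```

Scaling by the mean absolute weight makes fn a relative noise level, so fn = 0.5 and fn = 4 correspond to the 50% and 400% conditions in the text.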

3.2 Response properties of tectal neurons

We also simulate the response of tectal neurons to various stimulus attributes, i.e. spatial frequency, orientation, and speed. The same parameters are used for the tectal neuron as in the previous subsection. The image sequence contains 20 frames of size 100×100 deg². We choose the size of tectal receptive fields to be 50×50 deg². The remaining receptive-field parameters are not altered. The size of the rotundal-cell receptive fields is chosen as 100×100 deg². In the following, velocities are defined in deg/frame for convenience, and typical speeds of objects in this paper are chosen to be in the range of 0 to 5 deg/frame. The frame rate of the motion sequence allows translating the velocity units to deg/s. A typical frame rate is 24 frames/s. For example, a speed of 1 deg/frame corresponds to a speed of 24 deg/s.

First, we investigate the response of model tectal neurons to a grating that moves in the x direction with 1 deg/frame as a function of the spatial frequency of the grating. While the spatial-frequency tuning curves of individual model neurons exhibit multiple but random peaks (Fig. 6(a), left panel, blue lines), the mean response of the population (averaged over 200 neurons) shows a slight preference for high spatial frequencies (Fig. 6(a), thick red line). This result can be attributed to the preprocessing of the image sequence with a high-pass filter and conforms with the observation that tectal responses are suppressed by static stimuli. We calculate the number of peaks and the corresponding peak heights (above the mean) of the individual tuning curves. The histograms of the number of peaks and of the peak heights are plotted in the upper and lower right panels, respectively. Multiple peaks are commonly observed.
Fig. 6

Response properties of tectal model neurons. (a) Spatial frequency tuning and (b) orientation tuning each summed over a time interval of 20 frames and plotted for 9 neurons (blue curves). The average tuning curve obtained from 200 model cells is plotted in thick red. The corresponding histograms of peak number and peak height (above mean) are depicted in the upper and lower right panels, respectively. (c) Responses to a square moving along the x-axis with different constant speeds plotted as a function of x-velocity for 9 model neurons in blue and average tuning curve from 200 model cells in thick red. The histogram of the speed tuning index is depicted in the right panel

Next, we calculate the response of tectal neurons to a grating of a spatial frequency k = 0.2 cycles/deg moving with 1 deg/frame for different orientations of the grating. While the tuning curves of individual neurons show multiple peaks (Fig. 6(b), blue lines), the averaged tuning curve does not show selectivity for orientation (Fig. 6(b), thick red line). The histograms of the number of peaks and of the peak heights of the corresponding tuning curves are given in the upper and lower right panels, respectively.

Lastly, we compute the response of tectal model neurons to a solid square of size 10×10 deg² moving along the x axis for different constant velocities. Individual tuning curves show sensitivity to stimulus speed and occasionally weak directional selectivity (Fig. 6(c), blue lines). The averaged tuning curve exhibits strong sensitivity to stimulus speed, but does not show any directional selectivity (Fig. 6(c), thick red line). This result is in accordance with experimental data (Troje and Frost 1998). We calculated the speed tuning index of a tectal neuron by computing [r1 − r0]/r1 where r1 and r0 are the respective values of the tuning curve at vx = 1 deg/frame and vx = 0 deg/frame. All speed tuning indices smaller than zero were set to zero. The resulting histogram of the speed tuning index is presented in the right panel of Fig. 6(c). Most neurons of the population have a speed-tuning index close to 1.
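The speed tuning index defined above, including the clipping of negative values, can be written compactly. The function name and the handling of a zero response at vx = 1 deg/frame (returning 0) are assumptions added for this sketch; the text does not specify that case.

```python
def speed_tuning_index(r1, r0):
    """Speed tuning index (r1 - r0) / r1, with negative values set to zero.

    r1 and r0 are the tuning-curve values at vx = 1 deg/frame and
    vx = 0 deg/frame, respectively.
    """
    if r1 == 0:
        return 0.0  # assumption: a silent neuron is treated as untuned
    return max(0.0, (r1 - r0) / r1)

print(speed_tuning_index(2.0, 0.5))  # moving stimulus drives the cell more strongly
print(speed_tuning_index(1.0, 2.0))  # static response larger: index clipped to zero
```

An index close to 1 means the neuron responds almost exclusively to the moving stimulus, which is the case reported for most model neurons.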

To our knowledge, no experimental tuning curves to frequency and orientation of spatial Fourier components are available for deep tectal neurons of the tectofugal pathway. Spatial-frequency tuning has been measured only for neurons in superficial layers of the optic tectum. However, there is experimental evidence that deep tectal neurons are only weakly selective for direction of motion, but strongly tuned to speed (Troje and Frost 1998; Letelier et al. 2002).

3.3 Response properties of rotundal model neurons

Computer simulation is applied to study the response of a motion-sensitive rotundal model neuron selective for a velocity v = (1,0) deg/frame to a solid square of size 10×10 deg² moving along the x axis at different constant velocities. The parameters of the tectal neurons are chosen as in the previous subsection. The rotundal-cell receptive fields are again of size 100×100 deg². For the temporal filter of the rotundal neurons, we take ξ = 0.6 frame²/deg². The individual velocity tuning curves are presented in Fig. 7(a) (blue lines). Most of the tuning curves exhibit a pronounced sensitivity to direction of motion. This is reflected in the average tuning curve (based on 200 neurons), depicted in thick red. Directional selectivity has been observed for a certain class of rotundal neurons (Wang and Frost 1990; Wang et al. 1993).
Fig. 7

Response properties of rotundal model neurons. (a) Velocity tuning curves of individual rotundal model neurons, selective for a velocity in the x-direction of 1 deg/frame, are depicted in blue. The average tuning curve is plotted in thick red. Here, preprocessing is included in the model. (b) Same as a, but with no preprocessing. (c) Histogram of the peak height for the tuning curves of rotundal neurons (with preprocessing). (d) Histogram of the peak number obtained from the tuning curves of rotundal neurons (with preprocessing). (e) Histogram of the direction-tuning index obtained from the tuning curves of rotundal neurons (with preprocessing)

For comparison, we compute the tuning curves for rotundal neurons without including the preprocessing step at the tectal level (Fig. 7(b)). The resulting tuning curves show strong selectivity for direction of motion, demonstrating that this rotundal property is not a consequence of the preprocessing operation. The selectivity for direction stems instead from the temporal filtering taking place at the tecto-rotundal projection.

To date, there is not sufficient data available to compare our results quantitatively to real velocity-tuning curves. Experiments however have shown that neurons in the ventral subdivision of the nucleus rotundus are sensitive to the direction of motion (Wang et al. 1993; Wang and Frost 1990).

For the rotundal population of Fig. 7(a), we computed the peak height and the number of peaks of the tuning curves. The histograms are plotted in Fig. 7(c, d). Most tuning curves exhibit only a single peak; however, double peaks are observed as well. We further calculated the direction tuning index of each rotundal neuron as \([r_1 - r_{-1}]/r_1\), where \(r_1\) and \(r_{-1}\) are the respective values of the tuning curve at vx = 1 and vx = −1 deg/frame. Most neurons of the population are directionally selective, with a direction tuning index larger than 0.5 (see Fig. 7(e)). A population of 12×10⁴ tectal neurons was chosen for this simulation.
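A short sketch of the direction tuning index and of the fraction of directionally selective cells follows. The representation of a tuning curve as a dictionary from x-velocity (deg/frame) to response, the zero-response convention, and the two toy curves are hypothetical and serve only to illustrate the definition.

```python
def direction_tuning_index(tuning_curve, preferred=1, opposite=-1):
    """Direction tuning index (r_pref - r_opp) / r_pref of one tuning curve.

    `tuning_curve` maps x-velocity in deg/frame to the neuron's response.
    """
    r_pref = tuning_curve[preferred]
    r_opp = tuning_curve[opposite]
    if r_pref == 0:
        return 0.0  # assumption: no response at the preferred velocity
    return (r_pref - r_opp) / r_pref

# Toy curves: the first is strongly direction selective, the second is not.
curves = [{-1: 0.2, 0: 0.1, 1: 1.0}, {-1: 0.9, 0: 0.3, 1: 1.0}]
indices = [direction_tuning_index(c) for c in curves]
selective_fraction = sum(i > 0.5 for i in indices) / len(indices)
print(indices, selective_fraction)
```

The index is a ratio of responses and therefore dimensionless; the threshold of 0.5 is the one used in the text to call a neuron directionally selective.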

3.4 Motion segmentation through motion-sensitive rotundal subpopulations

Simulations are carried out to explore the responses of rotundal model neurons that are selective for a velocity v = (1,0) deg/frame to a stimulus consisting of a camouflaged random-dot square moving with a speed of 1 deg/frame to the right, in front of a random-dot background pattern moving with a speed of 1 deg/frame to the left. A schematic of the stimulus is given in Fig. 8(a). Motion segmentation in the proposed model is simulated by computing Mrc,v(x,t) for the input image sequence based on a tectal cell population of 12×10⁴ neurons and a rotundal cell population of 8×10⁴ neurons. The response Mrc,v(x,t) of the rotundal population in real space is depicted in Fig. 8(b). The moving camouflaged random-dot square has been segmented with sharp boundaries and precise spatiotemporal detail.
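A stimulus of this kind can be generated with a few lines of code. The sketch below is an illustration only: the image size, square size, binary dot values, wrap-around boundaries, and the function name `random_dot_sequence` are assumptions, and it produces the input sequence, not the model response.

```python
import random

def random_dot_sequence(size=40, square=10, n_frames=5, speed=1, seed=0):
    """Random-dot square moving right over a random-dot background moving left.

    Returns a list of frames, each a size x size list of 0/1 pixel values.
    The square is visible only through its motion (it is camouflaged in
    any single frame).
    """
    rng = random.Random(seed)
    background = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    texture = [[rng.randint(0, 1) for _ in range(square)] for _ in range(square)]
    top = (size - square) // 2
    frames = []
    for t in range(n_frames):
        shift = speed * t
        # Background drifts left: pixel x shows background column x + shift (wrapped).
        frame = [[row[(x + shift) % size] for x in range(size)] for row in background]
        left = (top + shift) % size  # square starts centred and moves right
        for dy in range(square):
            for dx in range(square):
                frame[top + dy][(left + dx) % size] = texture[dy][dx]
        frames.append(frame)
    return frames

frames = random_dot_sequence()
print(len(frames), len(frames[0]), len(frames[0][0]))
```

Because square and background share the same dot statistics, the square cannot be found in a single frame; only the opposite motion directions distinguish the two regions.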
Fig. 8

Motion segmentation. (a) Input image sequence showing a camouflaged random-dot square moving to the right in front of a random-dot background that moves to the left. (b) Segmented random-dot square reconstructed from a velocity-selective rotundal subpopulation response

3.5 Local velocity computation and the aperture problem

We first investigate the response pattern of a C2 model neuron population to a moving square stimulus of size 3×3 deg². The mean response values of C2 model neurons at the time point where the stimulus crosses their receptive-field center are plotted as a function of the preferred velocity of the C2 model neurons in Fig. 9(a). The curve peaks at vx = 2 deg/frame, which is the velocity of the stimulus. The position tuning curve of a C2 neuron to the same stimulus with respect to its receptive-field center is shown in Fig. 9(b). The responsive region of the C2 neurons is approximately 20 deg in diameter. For a larger stimulus of size 12×12 deg², population tuning becomes less pronounced (see Fig. 9(c), solid line); however, the correct velocity estimate can still be extracted from the population response. If the area outside the responsive region of the model neuron is masked, the responses of the C2 neuron suggest an incorrect stimulus velocity of vx = −2 deg/frame in the x-direction (see Fig. 9(c), dashed line). This demonstrates that distributed global processing of velocity and subsequent reconstruction of position allow velocities to be reconstructed locally without introducing an aperture. The parameter choices for the C2 model neurons were τf = 0.2 deg², ξ = 0.6 frame²/deg², and α = 3 deg.
Fig. 9

Response properties of C2 neurons. (a) Mean response value of C2 model neurons as a function of their preferred velocity to a small square of size 3×3 deg² moving with vx = 2 deg/frame in the x-direction. The correct stimulus velocity can be derived from the population response by finding the C2 model neuron with the largest response. (b) Mean response value of a C2 model neuron as a function of the position (with respect to the receptive-field center) of the same stimulus. (c) For a larger square of size 12×12 deg², the mean response value of the C2 model neurons still peaks at the correct stimulus velocity (solid line), even though the square approximates the size of the responsive area of the C2 model neurons (see b). If the area outside the responsive region of the model neuron of approximately 20 deg in diameter is masked, the population response suggests an incorrect stimulus velocity of vx = −2 deg/frame in the x-direction (dashed line)

3.6 Local-velocity fields from rotundal responses

In this section, we demonstrate that the proposed model enables the animal to compute optic flow from real image sequences. Using the algorithmic implementation of Eq. (26), we compute the local-velocity fields of four image sequences for parameter choices τf = 0.2 deg², ξ = 0.6 frame²/deg², α = 10 deg, and τr = 5. For practical implementation reasons, the temporal filter here has been chosen to be non-causal, which is expected to have only a minor effect on the results. The image sequences selected are benchmark examples commonly used in the machine-vision community (Barron et al. 1994).

In the SRI Sequence, a camera moves parallel to the ground plane along the x-axis (horizontal direction) in front of several trees. The velocities are as large as two pixels/frame, and the sequence contains 20 frames. A snapshot is depicted in the left panel of Fig. 10(a), while the right panel shows the estimated velocity field in the x-direction. The optic-flow field captures the predominant velocity pattern of the sequence and segments the images into foreground and background.
Fig. 10

Estimated local-velocity fields. (a) SRI sequence. (b) Hamburg taxi sequence. (c) Translating/diverging tree sequence. (d) Yosemite sequence

In the Hamburg taxi sequence, a street scene is shown with four moving objects: a taxi turning the corner, a car in the lower left driving from the left to the right, a van in the lower right driving from the right to the left, and a person walking in the upper left. Image speeds of the four moving objects are approximately 1.0, 3.0, 3.0, and 0.3 pixels/frame, respectively. The sequence contains 20 frames. A snapshot is shown in the left panel of Fig. 10(b). Adopting the same parameters as for the SRI sequence, our algorithm returns an optic-flow field in which the moving objects are clearly visible (Fig. 10(b), right panel). The velocity estimates are close to the true velocities of the objects.

The translating and diverging tree sequences are created by moving a camera sideways and towards an image of a tree, respectively. The algorithm returns flow fields with 97% density and angular errors of 1.19 deg for the translating-tree sequence and 3.83 deg for the diverging-tree sequence (Fig. 10(c)).

We also apply the algorithm to the well-known Yosemite sequence (Fig. 10(d)). Each frame of the Yosemite sequence has been generated by mapping aerial photography onto a digital-terrain map. Speeds in the lower left corner go up to four pixels/frame, while the clouds translate with about one pixel/frame to the right. The algorithm achieves an angular error of 7.73 deg everywhere except for the area of the clouds. In the cloud area, the true motion is unknown, since the clouds are undergoing Brownian motion and changing shape.

4 Discussion

We have presented a firing-rate model of the retino-tecto-rotundal pathway for the processing of Fourier-based motion. In this model, responses of tectal neurons are obtained by integrating the visual space over the receptive field of the neuron, which, in accordance with experimental data, is assumed to consist of random dots sparsely distributed over a large area of the visual space. We have established that despite the lack of periodic structures, motion signals can be generated, giving rise to directionally-selective responses of neurons in the nucleus rotundus. Using biologically plausible model parameters, a characteristic distribution of direction-tuning indices for the rotundal population is predicted. Furthermore, spatial information is retained in the population response and can be retrieved at any stage of the processing stream. As a proof of concept, we showed that local velocity estimates may be derived from responses of the rotundal model neuron population through superposition of rotundal responses by a neural network. This includes the prediction of neurons jointly selective for position and velocity, potentially located in the caudal ectostriatum (Nguyen et al. 2004). Motion-sensitive neurons in the caudal ectostriatum receive input from the nucleus rotundus and have large receptive fields (Nguyen et al. 2004; Gu et al. 2002). The emergence of so-called hot spots within the excitatory receptive field of ectostriatal neurons might indicate the onset of position reconstruction (Gu et al. 2002). Using an algorithmic equivalent of the model, local-velocity fields of four real sequences featuring complex motions have been computed for a fixed set of parameters, demonstrating the feasibility of the approach.

Considering the large receptive fields of tectal and rotundal neurons, a distributed representation of spatiotemporal information is a plausible choice to describe motion processing in the retino-tecto-rotundal pathway. The model results demonstrate that high spatial acuity is indeed in agreement with the specific properties of this pathway. This is also in agreement with work on coarse coding by Hinton et al. (1986), who showed that a stimulus can be represented more accurately by a collection of neurons with broad response functions than by a collection of neurons with more finely-tuned response functions.

So far, we confined the analysis and modeling to feedforward processes. However, neurons in the tecto-rotundal pathway exhibit contextual effects. For example, tectal responses to a moving stimulus are suppressed by a background moving in the same direction, and vice versa (Frost and Nakayama 1983; Sun et al. 2002). Contextual influences are thought to be mediated by lateral connections or through feedback from brain areas at a later processing stage (Nakayama 1985; Dellen and Wessel 2008). Our model of the retino-tecto-rotundal pathway might be extended to allow for such interactions. In the future, we aim to investigate the role of isthmo-tectal feedback on motion processing (Meyer et al. 2008). Further, there is evidence that certain classes of neurons in the nucleus rotundus compute various optical variables of looming objects (Wang et al. 1993; Sun and Frost 1998). The neuronal responses of these neurons could be modelled adequately by sampling over tectal subpopulations that encode radial spatial frequencies at the tecto-rotundal projection.

Representing the stimulus by distributed representations and performing motion processing in this global space offers specific advantages compared to local representations, such as representational efficiency, i.e. more entities can be encoded by the same number of neurons, and graceful degradation of performance in response to network damage or noise. For motion processing tasks, distributed representations have the advantage that local velocity responses obtained via superposition of responses of wide-field neurons are not constrained by the (measurement-window-induced) aperture problem (see Fig. 9), which is typically introduced when utilizing a small measurement window, i.e. small receptive fields. Hence, the proposed model and the respective optic-flow algorithm are fundamentally different from theories of motion processing culminating in the computation of optic flow that have been developed for the geniculocortical pathway in mammals (Adelson and Bergen 1985; Heeger 1988). In these models, velocity estimates are derived from simple and complex cells that feature periodically arranged on and off subunits, with each cell covering a local patch of the visual space. In our model, local velocity estimates arise from constructive interference effects in distributed representations. Constructive interference allows joint selectivity for position and velocity to arise, even though at previous processing steps velocity-sensitive rotundal neurons have received input from the entire visual field. The model shows that global transformations are not in conflict with the computation of local velocities.


The work has received support from the German Ministry for Education and Research (BMBF) via the Bernstein Center for Computational Neuroscience (BCCN) Göttingen under Grant No. 01GQ0430, the NIH/NEI, ROI EY015678, and the EU Project Drivsco under Contract No. 016276-2. JWC acknowledges support from Fundação para a Ciência e a Tecnologia of the Portuguese Ministério da Ciência, Tecnologia e Ensino Superior and Fundação Luso-Americana.

Open Access

This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

Copyright information

© The Author(s) 2009