Decision making can be regarded as a cognitive process of evaluating and selecting a course of action from among various alternatives, through which a final choice is produced. Developing novel technologies for studying human behavior during complex cognitive processes such as decision making has been drawing increasing attention.

Among other novel technologies, eyetracking techniques have been developed over recent decades as a window into people’s visual and cognitive processes. Eyetracking techniques have been applied to realms of behavior research such as image scanning (e.g., Noton & Stark, 1971), driving (e.g., Land & Lee, 1994), arithmetic (e.g., Suppes, 1990), analogy (e.g., Salvucci & Anderson, 2001), reading (e.g., Rayner, 1995), mental health (e.g., Bal et al., 2010), and memory recall (e.g., Zargari Marandi & Sabzpoushan, 2014), to name only a few. Eye movements are typically analyzed in terms of pauses over informative regions of interest, which are called fixations, and rapid movements between fixations, which are called saccades (Salvucci & Goldberg, 2000).

In the realm of neuroscience and psychology studies—as well as in the fields of industrial engineering and human factors, marketing and advertising, and computer science—eye movement research is of great importance, due to its informative features. Different sources of information that the subject is using, how frequently they are sampled, and how they affect decisions are some of the useful cues that eyetracking technology can reveal (Richardson & Spivey, 2004).

Eyetracking can reveal how effective an advertisement is. In this regard, the motivation to utilize an eyetracker in market research stems from the desire to understand consumer actions (Duchowski, 2007). Consumer behavior rests on the interplay between immediate perceptual factors and decision-making processes that can be seen in eye movement behavior. It was soon realized that eye movement technologies could be used to identify the factors that determine the amounts of attention allocated to different advertisements (Karslake, 1940). Such measures have predictive value, since the gaze durations to competing advertisements correlate with later product choice (Lohse, 1997; Treistman & Gregg, 1979).

The relationship between eye movements and cognitive processing has been employed as a tool to improve ergonomic design and computer interfaces (Kramer & McCarley, 2003). The emergence of gaze-contingent eyetracking has allowed eye movements to be further employed as a means to interact with such systems (Duchowski, 2002). On the one hand, since fixations reveal salient parts of the visual world to the user, eye movement information can be used to tailor visual displays; on the other hand, regions that are not fixated can be rendered at a far lower level of detail (Ohshima, Yamamoto, & Tamura, 1996; Santella & DeCarlo, 2002; Watson, Walker, Hodges, & Worden, 1997), and thus eyetracking can provide researchers with valuable computational shortcuts.

The surrounding visual world, as well as the underlying processes in individuals' minds, are the driving factors of one's eye movements. Uniquely poised between perception and cognition, eye movements provide invaluable tools for psychologists. The data concerning the temporal dynamics and psychological processes involved in a subject's eye movements are richer and more dynamic than the single data points provided by button-press reaction times (Richardson & Spivey, 2004). To better measure cognitive processes during decision making, eyetracking can also be used in combination with common methods such as mouse-tracing, a combination introduced by Franco-Watkins and Johnson (2011).

Among eyetracking techniques, electrooculography (EOG) has remained relatively unexplored in cognitive science studies, owing to numerous measurement problems. Typically, alternative approaches such as video oculography (VOG) or infrared oculography (IROG) are applied instead, because of their low cost and long-term stability. In the present work, EOG is used to introduce a novel paradigm for studying consumer behavior. The most novel aspect of this study's methodology is the use of an artificial neural network to map the vertical and horizontal EOG signals onto calibrated two-dimensional gaze signals. A further contribution of this work is a qualitative model for predicting subjects' choices on the basis of their eye movements during a task of selecting between two items.

Method

An experimental method was used to study the decision-making process. The experiment involved presenting visual stimuli to subjects while recording their eye movements and voices, and it was performed in a dim and quiet room. To allow distinct and near-real-world eye movements, the visual stimuli were presented by a video projector on a 116 × 66 cm screen. The pictures used as visual stimuli had a resolution of 640 × 480 pixels. The stimuli were designed and implemented using Adobe Flash software. Before the experiment, subjects were instructed on how to perform the tasks through interactive multimedia instructions running on a computer. The instructions were also shown in text format before task onset for each subject. The experiment included a calibration task before the main task. Subjects sat in an armchair and adjusted their positions so that their eyes looked straight ahead, toward the center of the screen. Subjects were trained to keep their heads still during the experiment by means of a fixation marker adjusted to their chins. The subjects' distance from the screen was between 160 and 170 cm.

Experiment

In this work, 24 pictures, each showing two items from the same category, were presented, and subjects chose between the two items within a 10-s interval. The items came from eight categories: sunglasses, armchairs, watches, glasses, buffet lightings, sneakers, cookware sets, and lamp shades. Each pair of items was shown three times. The first time, the items were shown without any economic information, such as prices or discounts. In the second presentation of each pair, price information was added. In the third presentation, a discount factor was added to one of the items in the pair. The two items were located on the left and right sides of each picture. Pictures from the same category were not presented consecutively; instead, the three presentations of the same pairs were interleaved in a random sequence.

Several constraints guided the selection of items for the experiment. Items were selected such that the two items in a pair were not from different brands, or their brands were not shown. Moreover, gender-related items such as clothes were excluded. These constraints were intended to prevent subjects from making selections based on gender- and/or brand-related biases, which might lead to differences between subjects that were not due to differences in their mental processes.

Subjects

Eleven subjects (five females and six males) volunteered for our experiment. They were 22 to 29 years old (M = 26.1, SD = 3.2). All of the subjects had normal or corrected-to-normal vision, and most were either undergraduate or graduate students from Iran University of Science and Technology.

Apparatus

Neurophysiologic signals were recorded and filtered using a Simulink application. For the present study, a g.USBAmp amplifier (g.tec Medical Engineering GmbH, Austria) was used. This device is a multimodal amplifier for electrophysiological signals such as electroencephalography (EEG), EOG, electromyography (EMG), and electrocardiography (ECG). EOG signals were captured by five electrodes placed on the subject's face and were recorded using a bipolar configuration. For the horizontal EOG, the electrodes were placed at the outer canthus of each eye, and for the vertical EOG, infra-orbital and supra-orbital electrodes were placed in line with the pupil of one eye. A ground electrode was placed on the earlobe. The sampling rate was set to 256 Hz. High-purity gold (Au) electrodes with a resistance lower than 5 kΩ were used in this research.

Eye movement analysis

Eye movements can be analyzed using EOG signals. Before the main signal processing, a preprocessing stage is critical to suppress artifacts, such as noise and baseline drift, from the EOG signals.

EOG signal preprocessing

EOG signals, like other biological signals, are always affected by noise and baseline drift. Noise in EOG signals may have several sources, including electric power lines, measurement circuitry such as wires and electrodes, or other interfering physiological sources, such as EMG signals. Although EEG signals may also affect EOG signals, such effects are negligible, because the EEG signal amplitude is low enough to be omitted in comparison with the EOG signal amplitude. To prevent EMG artifacts as much as possible, subjects were trained to avoid clenching their jaws during the experiment. To cope with the aforementioned artifacts, the EOG signals were first low-pass filtered with a 30-Hz cutoff frequency by a filter integrated into the device. Moreover, an additional 50-Hz notch filter was applied to suppress power-line noise. The notch filter was applied using g.tec's application programming interface (API) for MATLAB.
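As an offline illustration of this noise-removal step, the following Python sketch applies an equivalent 30-Hz low-pass filter and 50-Hz notch filter to one EOG channel. In the actual experiment the low-pass filter was integrated into the amplifier and the notch filter was applied through g.tec's MATLAB API, so this is only a rough equivalent; the filter order and notch quality factor are assumptions.

```python
# Hedged sketch: offline low-pass (30 Hz) and notch (50 Hz) filtering of one
# EOG channel sampled at 256 Hz, approximating the hardware/API filtering
# described above.
from scipy import signal

FS = 256.0  # sampling rate (Hz), as stated in the Apparatus section

def denoise_eog(raw, fs=FS):
    """Low-pass (30 Hz) and notch (50 Hz) filter one EOG channel."""
    # 4th-order Butterworth low-pass filter with a 30-Hz cutoff (order assumed)
    b_lp, a_lp = signal.butter(4, 30.0 / (fs / 2.0), btype="low")
    lowpassed = signal.filtfilt(b_lp, a_lp, raw)
    # IIR notch filter centered at 50 Hz to suppress power-line interference
    b_n, a_n = signal.iirnotch(w0=50.0, Q=30.0, fs=fs)
    return signal.filtfilt(b_n, a_n, lowpassed)
```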

As can be seen in Figs. 1 and 2, baseline drift is a slow change superposed on the EOG signal that is unrelated to eye movements. It has quite a few possible sources, for example, interfering background signals or electrode polarization (Gu, Meng, Cook, & Faulkner, 2001). An approach based on a wavelet transform was used to remove baseline drift (Tinati & Mozaffary, 2006). To do so, a multilevel one-dimensional wavelet decomposition at Level 12, using Daubechies wavelets, was performed on each EOG signal component. A baseline drift estimate was acquired from the reconstructed approximation coefficients (red curves in Figs. 1 and 2). The corrected signals with reduced drift offset (blue curves in Figs. 1 and 2) were obtained by subtracting this estimate from each original signal component.
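The drift-removal step could be sketched as follows with the PyWavelets package. The specific Daubechies order ("db4") is an assumption, since only the wavelet family and the decomposition level (12) are specified above; a level-12 decomposition presumes a recording long enough (several minutes at 256 Hz) to support that depth.

```python
# Hedged sketch of wavelet-based baseline-drift removal: a level-12 Daubechies
# decomposition whose reconstructed approximation is taken as the drift
# estimate and subtracted from the signal.
import numpy as np
import pywt

def remove_baseline_drift(eog, wavelet="db4", level=12):
    """Return (corrected signal, drift estimate) for one EOG component."""
    coeffs = pywt.wavedec(eog, wavelet, level=level)
    # Keep only the coarse approximation; zero out all detail coefficients
    approx_only = [coeffs[0]] + [np.zeros_like(c) for c in coeffs[1:]]
    drift = pywt.waverec(approx_only, wavelet)[: len(eog)]
    return eog - drift, drift
```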

Fig. 1

Baseline drift removal from the EOG signal in the horizontal channel (HEOG) for one subject, as an example

Fig. 2

Baseline drift removal from the EOG signal in the vertical channel (VEOG), for the same time window and the same subject as in Fig. 1

As is shown in Fig. 3, blinks appear in the vertical EOG signal component as overshoots in signal amplitude. This undesirable effect can be removed using a median filter, which is well known for impulse-noise removal (Krupiński & Mazurek, 2011). Several approaches have recently been introduced that can produce more plausible results than a median filter; however, a median filter was sufficient for the objectives of this work, as was verified by visual inspection of the vertical EOG signal component. The overall stages of signal processing performed in this work are shown in Fig. 4.
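A minimal sketch of this blink-suppression step is given below, assuming a median-filter window of roughly 0.3 s at 256 Hz; the kernel size actually used is not reported above, so this value is an assumption.

```python
# Hedged sketch: median filtering of the vertical EOG channel to suppress
# blink overshoots. The window length is an assumption.
from scipy.signal import medfilt

def remove_blinks(veog, fs=256, window_s=0.3):
    """Apply a median filter with an odd kernel of roughly window_s seconds."""
    kernel = int(window_s * fs)
    kernel += 1 - kernel % 2  # medfilt requires an odd kernel size
    return medfilt(veog, kernel_size=kernel)
```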

Fig. 3

Using a median filter to remove blinks from the vertical channel of the EOG signal (VEOG) of one subject, as an example. Some of the blinks are circled manually.

Fig. 4

Overall stages of signal processing performed in this work. First, vertical and horizontal components of the EOG signal were recorded. Second, these components, namely HEOG and VEOG, were passed through a low-pass and a notch filter as a noise-removal step; after that, baseline drift (BW) was removed using wavelets, and then a median filter was applied to remove blink effects from the VEOG. Finally, the prepared signals were fed into an artificial neural network (ANN) for calibration

EOG signal calibration using an artificial neural network

It was necessary in this work to know the position of gaze on the screen during the experiment. To obtain gaze position from the EOG signal components, a calibration task was required. Thirty-two fixation points were presented sequentially, each for a 2-s interval, at different coordinates on the screen, and subjects were asked to look at them in order to perform the calibration task. The arrangement of the fixation-point positions was designed to elicit saccadic eye movements with different amplitudes and directions. The fixation points were presented as yellow light points on a black background, in positions covering the whole screen. When the calibration procedure was done, the decision-making task was initiated.

Using the EOG signal components from the calibration procedure, it was possible to map the amplitudes of both the horizontal and vertical components of the EOG signal to their corresponding screen coordinates. To do so, a two-layer feed-forward artificial neural network with a Levenberg–Marquardt (LM) training function was used. The horizontal and vertical components of the EOG signals served as the network inputs, and the hidden layer contained 40 neurons. The output layer produced two numbers, the x- and y-coordinates on the screen, corresponding to the horizontal and vertical components of the EOG signal, respectively. The activation functions of the network were a sigmoid function for the hidden layer and a linear function for the output layer.

Sixty percent of the EOG signal samples recorded during calibration were randomly selected to train the network; twenty percent were used for validation, and the rest were used to test the network. The mapping of the EOG signals to their corresponding coordinates was then evaluated using mean squared errors (MSEs). The mean accuracy across subjects was 87.9 %, with a standard deviation of 8.1 %. A sample of the EOG signals and their corresponding coordinates is shown in Fig. 5.
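For illustration, the calibration mapping could be approximated as in the Python sketch below. Note that scikit-learn provides no Levenberg–Marquardt trainer (as MATLAB's trainlm does), so this sketch keeps the architecture (40 logistic hidden units, linear outputs) and the 60/20/20 split but substitutes the L-BFGS solver; it is not the exact implementation used in this work, and all variable names are hypothetical.

```python
# Hedged sketch of the EOG-to-screen-coordinate calibration network.
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# X: (n_samples, 2) preprocessed HEOG/VEOG amplitudes during calibration
# Y: (n_samples, 2) known screen coordinates (x, y) of the fixation points
def calibrate(X, Y, seed=0):
    # 60 % training, 20 % validation, 20 % test
    X_train, X_rest, Y_train, Y_rest = train_test_split(
        X, Y, train_size=0.6, random_state=seed)
    X_val, X_test, Y_val, Y_test = train_test_split(
        X_rest, Y_rest, test_size=0.5, random_state=seed)

    net = MLPRegressor(hidden_layer_sizes=(40,), activation="logistic",
                       solver="lbfgs", max_iter=2000, random_state=seed)
    net.fit(X_train, Y_train)

    print("validation MSE:", mean_squared_error(Y_val, net.predict(X_val)))
    print("test MSE:", mean_squared_error(Y_test, net.predict(X_test)))
    return net
```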

Fig. 5

Example of vertical and horizontal EOG signals before (left) and after (right) calibration using the artificial neural network

As can be seen in Fig. 5, the EOG signal exhibits a high degree of nonlinearity due to its inherent physiological characteristics, so a technique is required to estimate these variations properly. The ability to model an unspecified nonlinear relationship between input and output patterns is one of the most significant features of ANNs (Woo & Park, 2006). For these reasons, an ANN was preferred over other approaches, although evaluating other methods that could yield plausible results remains an open issue.

Experimental parameters

Sixteen parameters were used in this work to analyze the decision-making process. These parameters are listed in Table 1.

Table 1 List of parameters extracted from the experiment

Response time measurement

The subjects' voice signals were used to measure their response times. The response time was the time, within each picture presentation interval, taken by a subject to announce his or her choice. Response times can be detected from the voice signal amplitude, because a marked change in amplitude appears when subjects state their choices.
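A minimal sketch of how the voice onset, and hence the response time, might be detected from the amplitude is shown below. The specific rule (a threshold of a few standard deviations above a silent baseline at the start of each interval) is an assumption; the paper only states that a large amplitude change marks the response.

```python
# Hedged sketch: response time as the first moment the voice amplitude exceeds
# a threshold derived from an assumed-silent baseline window.
import numpy as np

def response_time(voice, fs, baseline_s=0.5, k=5.0):
    """Return the response time in seconds, or None if no voice onset is found."""
    envelope = np.abs(voice)
    baseline = envelope[: int(baseline_s * fs)]      # assumed silent at interval start
    threshold = baseline.mean() + k * baseline.std()
    onsets = np.flatnonzero(envelope > threshold)
    return onsets[0] / fs if onsets.size else None
```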

Fixation duration evaluation

Fixations can be detected from the EOG signal components: when there are no considerable changes in the amplitudes of both the horizontal and vertical components simultaneously, it can be inferred that a fixation is occurring. Fixation durations on the items can be measured from the EOG signal components mapped to their corresponding screen coordinates. The mapped data were then compared with the item coordinates; if the gaze coordinates fell within an item's area on the screen while a fixation occurred, a fixation duration on that item was recorded.
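A sketch of how fixation durations on each item could be accumulated from the calibrated gaze is given below, using a dispersion-style stability criterion on the mapped coordinates. The dispersion and duration thresholds and the rectangular item regions are assumptions chosen for illustration, not values reported in this work.

```python
# Hedged sketch: dispersion-based fixation detection on calibrated gaze samples,
# with each fixation assigned to the item region containing its centroid.
import numpy as np

def fixation_durations(gaze_xy, fs, regions, max_disp=30.0, min_dur_s=0.1):
    """gaze_xy: (n, 2) calibrated gaze array; regions: {name: (x0, y0, x1, y1)}."""
    durations = {name: 0.0 for name in regions}
    n = len(gaze_xy)
    min_len = int(min_dur_s * fs)
    i = 0
    while i < n:
        j = i + 1
        # Grow the window while the gaze stays within max_disp in both x and y
        while j < n and np.all(np.ptp(gaze_xy[i:j + 1], axis=0) <= max_disp):
            j += 1
        if j - i >= min_len:                    # stable long enough: a fixation
            cx, cy = gaze_xy[i:j].mean(axis=0)  # fixation centroid
            for name, (x0, y0, x1, y1) in regions.items():
                if x0 <= cx <= x1 and y0 <= cy <= y1:
                    durations[name] += (j - i) / fs
        i = j
    return durations

# Example (assumed) regions for a 640 x 480 picture with items on the left and
# right halves:
# regions = {"left": (0, 0, 320, 480), "right": (320, 0, 640, 480)}
```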

Results

Every parameter in Table 1 was measured for all subjects, both male and female, so that their differences could be compared. Moreover, some parameters were compared across the three stages of item presentation.

Table 2 shows that subjects looked at their selected items at the moment of selection. According to a t test, t(10) = 71.62, SE = 1.25, p < .0001, the rate of correspondence between the subjects' gaze at the selection moment and the selected item was significant. The first-viewed item had no significant effect on the subjects' choices, because its choice probability was near 50 % according to Table 2. The same chance level held for the correspondence between the chosen item and the item with the longest fixation duration.
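For illustration, the reported statistic can be read as a one-sample t test of the per-subject correspondence rates against the 50 % chance level implied by the two-item design; the sketch below works under that assumption, which is not stated explicitly above.

```python
# Hedged sketch: one-sample t test of per-subject correspondence rates (%)
# against an assumed 50 % chance level.
import numpy as np
from scipy import stats

def test_against_chance(rates_percent, chance=50.0):
    """rates_percent: one correspondence rate (in %) per subject."""
    t, p = stats.ttest_1samp(rates_percent, popmean=chance)
    se = np.std(rates_percent, ddof=1) / np.sqrt(len(rates_percent))
    return t, p, se
```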

Table 2 Gaze correspondences of the three stages to each other

Table 3 shows that economic factors had no effect on subjects' choices. Perhaps this is because of the assumption, stated to the subjects before the experiment, that they had enough money.

Table 3 Economic factors’ impacts on subjects’ choices

Figure 6 shows apparent differences between males and females in response times: males announced their choices slightly (about 1 s) later than females did. However, this difference was not statistically significant; t tests for the first, second, and third stages yielded p ≈ .09, p ≈ .14, and p ≈ .15, respectively, all greater than the significance level of α = .05. Moreover, we observed no significant difference between subjects' response times across the three stages.

Fig. 6

Comparison of mean response times for the three stages. Error bars denote the standard errors of the means

It can be seen from Fig. 7 that at the first stage, with no economic factor, there was no significant difference between males and females in the frequency of switching between the two items. The mean switching frequency lay between 2 and 3.

Fig. 7

Comparison of mean switching frequencies at the three stages. Error bars denote the standard errors of the means

According to Fig. 8, the mean fixation duration on the selected items was nearly constant across the three stages. Another feature common to the three stages in Fig. 8 is that the mean fixation duration on selected items was slightly longer for males than for females.

Fig. 8

Comparison of mean fixation durations on selected items at the three stages. Error bars denote the standard errors of the means

Figure 9 illustrates that the mean fixation duration on nonselected items increased as a new economic factor was added at each stage. It can also be seen from Fig. 9 that the mean fixation duration on nonselected items was longer (by about 1 s) for males than for females. According to t tests, this difference was statistically significant only for the second stage, with p ≈ .04 at a significance level of α = .05; the p values for the first and third stages were p ≈ .06 and p ≈ .12, respectively.

Fig. 9

Comparison of mean fixation durations on nonselected items at the three stages. Error bars denote the standard errors of the means

It can be observed from the results that the mean number of switches between items lay between 2 and 3. In addition, the correspondence of a subject's gaze at the selection moment with the selected item was significant. A qualitative choice model for the experiment can be extracted from these results; the proposed model is illustrated in Fig. 10.

Fig. 10

Proposed qualitative choice model for eye movements, based on our experimental results

The steps of the decision-making process in terms of eye movements, illustrated as a qualitative choice model in Fig. 10, can be interpreted as follows. Subjects first look at one of the two items and recognize its features. Thereafter, they look at the other item and recognize it, too. The subjects' gaze then returns to the first-viewed item. In parallel, a process of comparing and evaluating the items is carried out in the mind. Depending on the subject's decision, if the chosen item is the second one viewed, the gaze moves to that item's location; otherwise, the gaze remains fixed on the first item, which will be the chosen item.

Discussion

Although using more subjects and more items would make the results statistically more reliable, this research was aimed at introducing a novel approach for modeling the decision-making process on the basis of eye movements. In this study, choices were time-constrained, which may have put more mental pressure on the subjects than in comparable real-world situations; to address this, the experiment could be performed without time constraints. Moreover, performing the three stages of the experiment on three different days and comparing the results would better show whether the subjects' choices were based only on unsystematic personal preference or whether economic bias should be taken into account. Another point to be mentioned is that Parameters 13 to 16 in Table 1 were also measured for our analysis; however, the differences between the stages and between genders were not statistically significant for those parameters. Therefore, it stands to reason that these parameters have no effect on the design of a predictive model.

Eyetracking alone would not be sufficient to build a general model for predicting individuals' choices. Utilizing other measurement techniques, such as recording EEG or ECG signals, does not seem practical in everyday applications. Hence, other unobtrusive measurement techniques would be needed to build a general and reliable model. In this situation, another option would be an adaptive model for each person that could be trained on that person's earlier decision-making behavior to predict upcoming choices.

Regarding the use of techniques other than median filtering to remove blink effects from the vertical component of the EOG signal, some points can be considered. Recent research on this issue has presented valuable results that can be useful for future work. In particular, Martínez et al. (2008) compared several hybrid filters with median filters, and the median filters exhibited the best results. In other research, a detection function and FIR filtering failed to detect the occurrence of two blinks (Krishna & Amaranath, 2013). Apart from this, an optimization-based technique (Krupiński & Mazurek, 2011) appears to be more accurate than simple median filtering; however, it is also more computationally demanding. To address this computational issue, an approach based on filter banks has been proposed (Krupiński & Mazurek, 2012).

Another issue is the use of EOG instead of common eyetracking systems. EOG has some advantages over other eyetracking techniques, since it does not limit the field of vision and is more affordable. These benefits may make it an effective tool to use instead of EEG signals for researchers in the domain of brain–computer interfaces (Nakanishi, Mitsukara, Wang, Wang, & Jung, 2012; Usakli & Gurkan, 2010). Along with these studies, efforts are being made to develop algorithms for the automatic analysis of EOG data (Pettersson et al., 2013).