Introduction

Eye Movements and Sexuality

From an evolutionary perspective, attention is crucial for survival: attentional processes ensure that we identify potentially life-threatening objects, like snakes or spiders, as quickly as possible (Ohman, 2009). Furthermore, attention plays a role in gathering reproductive information (Krupp, 2008). Not surprisingly, attentional processes play an important role in current theories of sexual arousal (de Jong, 2009; Dekker & Everaerd, 1989). One possible way of exploring attentional processes is through the measurement of eye movements. One advantage of eye tracking is the ability to capture competitive attentional processes. The ability of humans to identify fine details is limited to two degrees of central vision making up the foveal region of the retina (Rayner & Pollatsek, 1992). Therefore, eye movements are necessary to capture an entire complex scene and to identify those scene aspects that attract the viewer’s attention. For scene perception, the relevant human eye movements can be divided into fixations and saccades. Saccades are voluntary or reflexive rapid eye movements, which switch the fovea from one stimulus aspect to the next. Acquisition of information cannot take place here. Fixations, however, are defined as a time period in which the eye does not move (with the exception of micro saccades) and acquisition of information occurs (Henderson & Hollingworth, 1999). The visual attention lies on the part of the visual field the eye is fixating (Just & Carpenter, 1976).

The aim of the current study was to explore attentional engagement through eye tracking methodology while simultaneously presenting two sexual stimuli whose only difference was in their sexual relevance for the subject. Thus, two sexual stimuli competed for the attention of the viewer. Another aspect of this study was to compare early (initial orienting) and late attentional processes (maintenance of attention) in a direct manner by measuring eye movements while viewing sexual stimuli.

Eye tracking has been applied since the end of the nineteenth century (Delabarre, 1898) but has been applied to sexuality only in recent years (Lykins, Meana, & Kambe, 2006). In the majority of recent eye movement studies with regard to sexuality, the focus has been on gender differences. Using erotic stimuli, Lykins et al. showed that men and women focused significantly longer on bodies than on faces and longer on faces than on contextual regions. When presenting non-erotic stimuli, men and women looked significantly longer at faces than at bodies, and longer at bodies than at contextual regions. In a subsequent study, Lykins, Meana, and Strauss (2008) showed that heterosexual men looked significantly more often at female pictures than at male pictures, whereas heterosexual women looked equally often at male and female stimuli. Tsujimura et al. (2009) explored eye movements using two sexual videos, one depicting sexual intercourse and one not. Only in videos depicting no sexual intercourse did men observe the opposite sex significantly longer than women, with women observing the same sex longer than men. Faces attracted the most attention for both men and women. Another eye-tracking study examined the influence of hormones on eye movements, incorporating men, women using contraceptives, and women not using contraceptives (Rupp & Wallen, 2007). The stimuli comprised pictures of men and women engaged in intercourse. Men looked more often at female faces, whereas women without contraceptive use looked more often at genitals. Women with contraceptives looked at contextual stimulus aspects with a higher frequency. The menstrual cycle had no effect on the eye movements. These eye tracking studies show noteworthy gender differences in allocating attention to sexual and erotic images.

Other studies used eye tracking procedures to examine attentional processes while viewing sexually relevant images from a more evolutionary standpoint. For example, Suschinsky (2007) showed that male participants focused their attention more often on reproductive bodily regions (e.g., breasts and pubic area) than on other bodily regions. Images with lower waist-to-hip-ratios (WHR), and especially reproductively relevant regions in images with lower WHRs, received the most visual attention. Another study used eye-tracking procedures to explore how men examined images of naked women varying in WHR and breast size (Dixson, Grimshaw, Linklater, & Dixson, 2011a). The breasts and the waist received more first fixations than the face or the lower body parts (pubic area and legs). Men looked more often and longer at the breasts than at the head or the midriff, irrespective of the WHR of the images. In another study, there was no effect of breast size or areola pigmentation on the eye movements of heterosexual men (Dixson, Grimshaw, Linklater, & Dixson, 2011b). The results of these studies give important insight into attentional processing of sexually relevant stimuli and show that eye tracking is a promising tool in the research of sexuality.

Attention and Sexuality

Recent theories on sexuality accentuate the importance of attentional processes (de Jong, 2009). Based on the information processing approach model of Janssen, Everaerd, Spiering, and Janssen (2000), Spiering and Everaerd (2007) proposed a model assuming an interaction of automatic and controlled cognitive processes as well as an incremental influence of attentional processes on the subjective and physiological aspects of sexual arousal. Subjective sexual arousal was defined as an emotional experience, including the awareness of autonomic arousal, expectation of reward, and motivated desire (Everaerd, 1989). The model assumes that sexually relevant features of a stimulus are preattentively selected, and automatically trigger focal attention to these sexual aspects. If the preattentively selected sexual features match with sexual contents in the implicit memory, physiological arousal occurs in an automatic manner. If the preattentive physiological sexual arousal comes into consciousness, subjective sexual experience occurs. In addition to this automatic pathway, a controlled pathway exists: the focal attention to the sexually relevant stimulus induces conscious appraisal of these stimulus aspects. The outcome of the appraisal depends on the matching of the stimulus features with the sexual content in the explicit memory. If the stimuli are in accordance with the sexual scripts of the explicit memory, the viewer classifies the stimulus as sexual. This also induces a conscious experience of sexual arousal. Thus, the model assumes that sexual reactions depend on the appraisal of a stimulus, which incorporates interacting memory processes and attentional processes. Priming paradigms support the view that preattentive processing of sexual stimuli involves implicit, but not explicit, memory (Janssen et al., 2000; Spiering, Everaerd, & Janssen, 2003). On the other hand, several studies showed that decisions are made more slowly if an erotic element is present. 
Such an effect is known as Sexual Content-Induced Delay. This concept was introduced by Geer and Bellard (1996) and Geer and Melton (1997) and supports the proposed controlled pathway of processing sexual stimuli in a variety of experiments (e.g., Spiering, Everaerd, & Elzinga, 2002).

Important for the present study is that Spiering and Everaerd (2007) drew parallels to the processing of evolutionary fear-related stimuli by proposing that the preattentive selection of sexually relevant features of a stimulus can automatically trigger focal attention to the sexually relevant stimulus parts. If sexual arousal is defined as an emotion (Everaerd, 1989), this seems to be a logical conclusion. For emotion processing, it is well known that there is a mechanism of automatic appraisal which involves the implicit memory and a conscious elaboration of emotional information with involvement of the explicit memory (Ledoux, 2000; Whalen et al., 1998). If sexual arousal is understood as an emotion, eye tracking studies in the context of emotion and fear become relevant and may provide experimental designs which will be useful to investigate the underlying attentional processes. One advantage of eye tracking is the chance to explore early and late attentional engagement in real-time. A widely accepted view assumes that a covert shift of attention is immediately followed by an overt gaze shift to the attended spatial location (Henderson, 1992; Reichle, Pollatsek, Fisher, & Rayner, 1998). For example, Calvo and Lang (2004) used eye tracking to demonstrate an attentional bias to emotional pictures by presenting pairs of emotionally negative and neutral pictures, and pairs of emotionally positive and neutral pictures. Results demonstrated that the probability of the first fixation and the proportion of viewing time during the first 500 ms were higher for both pleasant and unpleasant pictures than for neutral pictures. These effects disappeared after the first 500 ms. This suggests that emotional meaning engages attention early and captures initial overt orienting. Nummenmaa, Hyona, and Calvo (2006) replicated these findings and showed that the initial orienting toward emotional pictures seems to be, at least to some extent, an automatic process.
Adapting this experimental approach to sexuality seems to be a promising tool in testing the assumptions of Spiering and Everaerd (2007) with regard to early and late attentional processes in the genesis of sexual arousal.

Aim of the Study and Hypothesis

Based on the current theoretical approaches, we defined sexual arousal as an emotion and followed the model of Spiering and Everaerd (2007). Consequently, we hypothesized that (1) heterosexual men would show an initial orienting towards the sexually preferred stimulus when this stimulus was presented simultaneously with a sexually non-preferred stimulus, and that (2) heterosexual men would give more attention to the preferred sexual stimulus than to the non-preferred stimulus, when both stimuli were presented simultaneously.

Method

Participants

Twelve heterosexual males, ranging in age from 19 to 35 years (M = 25.58 years; SD = 4.89), participated in the current study. All participants were recruited by a notice posted on the campus of the University of Goettingen. Heterosexuality of the participants was assessed by the Kinsey scale asking for physical contacts (Kinsey, Pomeroy, & Martin, 1948), accepting only ratings from 0 to 1 (exclusively and predominantly heterosexual). All participants were without history of neurological or psychiatric illness, pedophilia or sexual offenses according to the DSM-IV criteria (American Psychiatric Association, 2000), assessed by a psychiatric and sexual anamnesis. While not a participation criterion, none of the participants presently desired to have children or was a parent, so that the desire for children or actual involvement in parenting could not influence attentional effects. In accordance with the vote of the ethics committee of the University of Goettingen, participants were informed that they would see pictures of naked male and female adults and children and that the experiment was part of a larger study with pedophilic patients. All participants had normal or corrected-to-normal visual acuity. All provided written informed consent before participating in the experiment. The study was approved by the ethics committee of the medical faculty of Georg-August-University of Goettingen.

Stimuli

The stimuli were selected from the Not-Real-People (NRP) picture set (Pacific Psychological Assessment Corporation, 2004). The NRP picture set contains a total of 160 colored pictures of nude and clothed male and female persons at five different stages of pubertal development based on the categorization of Tanner (1973). The pictures were non-pornographic in terms of explicit sexual poses or sexual activity. In this study, only 64 male and female nude persons from the Tanner stages 1, 2, 4, and 5 were used, four from each stage. In order to enlarge the usable stimulus set, each picture was mirrored with CorelDraw Graphics Suite X4 (Corel Corp.), so that there were eight female and eight male pictures for each Tanner stage. Pictures from Tanner stages 1 and 2 were combined into the category “child” and pictures from Tanner stages 4 and 5 were combined into the category “adult.” We assumed that for heterosexual men three categories of sexually non-preferred stimuli exist: men, boys, and girls.

The original NRP pictures had different colored backgrounds, luminance levels, and complexity levels. Differences in visual low-level features, such as luminance and contrast, automatically attract attention in a bottom-up process, whereas the semantic content of a stimulus directs attention in a controlled, top-down process (Henderson, 2003). Without controlling the stimuli in respect to these low-level features, it is not possible to decide if observed attentional effects are based on the semantic content of the stimulus (e.g., sexually relevant meaning) or on the low-level features. Therefore, all pictures were converted with CorelDraw Graphics Suite X4 (Corel Corp.) into grayscale pictures, and the colored backgrounds were replaced for all pictures by a consistent grayscale background. All pictures were preprocessed with self-written MATLAB scripts (MATLAB Version 7.6.0, MathWorks Inc.) to match the pictures with respect to luminance and contrast. The luminance level was assessed by converting the images into the HSL color space, reading out the luminance value, and calculating the mean luminance value for the entire image. The complexity of the pictures was assessed in terms of the number of bytes of the compressed image file size in JPEG format. Studies demonstrated that the compressed image file size positively correlates with the image file complexity, as well as the human subjective judgment of picture complexity (Boudo, Sarlo, & Palomba, 2002; Forsythe, Mulhern, & Sawey, 2008). As luminance seems to influence the compression file size besides complexity (Zhang & Lu, 2004), our preprocessing for luminance should control this factor. The luminance and complexity levels were compared using one-way analyses of variance (ANOVA) in respect to the stimulus categories (girl, boy, woman, man). The stimulus categories did not differ in respect to luminance, F(3, 60) = 1.04, and complexity, F(3, 60) < 1.
Thus, low-level stimulus characteristics are not likely to explain possible attention effects between the stimulus categories.
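
The luminance check described above can be illustrated with a short sketch. It is not the authors' MATLAB code: the function names are hypothetical, the pixel data are assumed to be available as 8-bit RGB tuples, and the HSL "luminance value" is taken to be the lightness channel returned by Python's standard colorsys module. The F statistic follows the usual one-way ANOVA definition; with four categories of 16 pictures each it has the degrees of freedom (3, 60) reported in the text.

```python
import colorsys
from statistics import mean


def mean_lightness(pixels):
    """Mean HSL lightness (0..1) of an image given as 8-bit (r, g, b) tuples."""
    values = []
    for r, g, b in pixels:
        # colorsys expects channels in 0..1 and returns (hue, lightness, saturation).
        _, lightness, _ = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
        values.append(lightness)
    return mean(values)


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Toy example with made-up numbers (not the study's data).
toy_f, toy_df1, toy_df2 = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

In practice, one value per picture (mean lightness or JPEG file size in bytes) would be collected for each of the four stimulus categories and passed to one_way_anova_f as four lists.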

Each stimulus display consisted of two pictures presented in two opposing corners of the computer screen (top left/bottom right, top right/bottom left; see Fig. 1). In Experiment 1, the picture of a woman was combined with the picture of a girl or the picture of a man was combined with the picture of a boy. In Experiment 2, the picture of a woman was combined with one of a man or the picture of a girl was combined with that of a boy (see Fig. 1). The combination of pictures was pseudo-randomized, in that each picture was presented twice, but in two different combinations and with two different locations on the screen. Furthermore, the locations of the pictures were balanced across trials and their distance to each other was constant. The height of all pictures was 412 pixels (corresponding to 11.4° of visual angle at a viewing distance of 60 cm), with widths varying between 91 pixels (2.5°) and 280 pixels (7.8°). The picture pairs were matched in respect to their width. The distance between the two pictures was 16.4° (distance from center of the first picture to the center of the second picture). Thus, there was a minimal distance of 4° from the innermost border of one picture to the innermost border of the second picture.
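
As a rough plausibility check of the pixel-to-degree conversion, the following sketch computes visual angles from pixel sizes. It rests on assumptions: the pixel pitch is derived from the nominal 19-inch diagonal and the 1280 × 1024 resolution given in the Measures and Procedure section, pixels are taken to be square, and the function names are hypothetical. Because the true visible screen area may differ from the nominal diagonal, the results should land close to, but not necessarily exactly on, the reported values.

```python
import math


def pixel_pitch_mm(diagonal_inch, res_x, res_y):
    """Approximate size of one (square) pixel in mm from the nominal diagonal."""
    diagonal_px = math.hypot(res_x, res_y)
    return diagonal_inch * 25.4 / diagonal_px


def visual_angle_deg(size_px, pitch_mm, distance_mm):
    """Visual angle subtended by an object centered on the line of sight."""
    size_mm = size_px * pitch_mm
    return math.degrees(2 * math.atan(size_mm / (2 * distance_mm)))


pitch = pixel_pitch_mm(19, 1280, 1024)         # assumed nominal monitor geometry
height_deg = visual_angle_deg(412, pitch, 600)  # picture height at 60 cm
min_width_deg = visual_angle_deg(91, pitch, 600)
max_width_deg = visual_angle_deg(280, pitch, 600)
```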

Fig. 1

Illustration of the time sequence of an experimental trial, both for Experiment 1 (a) and Experiment 2 (b). The two experiments differed only in respect to the stimulus pairs. Note that these example pictures were not among the experimental stimuli.

Eye-Tracking

Eye movements were measured using an SMI iView X™ RED eye tracker (SensoMotoric Instruments GmbH, Berlin, Germany) in combination with an iView X™ workstation by measuring the corneal reflection and dark pupil with a video-based infrared eye camera. The SMI RED system is a contact-free, remote-controlled eye tracking device with an automatic eye and head tracker. In this manner, small head movements are automatically compensated, rendering it unnecessary to immobilize the head using a bite bar. However, it was necessary that the participant sit still during the experiment. The iView X™ RED system works with a spatial resolution of <0.1° of visual angle, a temporal resolution of 60 Hz, and a gaze position accuracy of <0.4° of visual angle. The system works with most glasses and contact lenses.

Measures and Procedure

The experiment was scripted with Presentation® (Version 13.0; Neurobehavioral Systems Inc., Albany). Stimuli were presented on a 19-inch TFT-monitor (resolution 1280 × 1024 pixels) at a refresh rate of 75 Hz. The participants were seated in a quiet room facing the monitor at eye level at a viewing distance of 60 cm. The experiment was divided into three phases. In the instruction phase, the participants read the instructions on the monitor. A task was introduced in which the participants had to compare the sexual attractiveness of the two presented stimuli in order to distract the participants from the eye-tracking. Participants were told that the eye movement measurements should assure their attention allocation to all pictures. They were also told that they had to look at both persons carefully to be able to make a qualified judgment of the attractiveness of the two persons. This ensured that the participants looked at all stimuli, both adults and children. In the practice phase, participants viewed eight test pictures of clothed persons in order to acclimate to the equipment and procedure. These eight test trials followed the same rationale as the main trials afterwards. The experiment phase consisted of Experiment 1 and Experiment 2, which only differed in respect to the stimulus combinations. The order of Experiment 1 and Experiment 2 was balanced across the participants. Each experiment consisted of 64 trials. The participants were given the chance to rest after the first experiment and after the first half of each experiment. At the beginning of each experiment and after each rest, a calibration was performed. The calibration consisted of having the participant fixate on nine points on the display area. Before each trial, a fixation cross (approximately 1° × 1°) appeared in the center of the screen. If the participant fixated on the fixation cross for at least 500 ms, the next trial started automatically.
This ensured that every participant would be looking at the middle of the stimulus display at the beginning of a trial. Next, two pictures appeared and remained for 5000 ms. After each stimulus presentation, a question appeared (“Was one of these persons more sexually attractive?”) and the participant had to respond using the computer mouse. Figure 1 shows the experimental design.

Subsequent to the experimental phase, the participants rated all 64 stimuli in respect to sexual arousal and valence on a 9-point Likert scale. This was performed on the same monitor used during the experiment without measuring the eye movements. Every picture was presented in the middle of the screen with 9-point rating scales, one on the left for valence (1 = unpleasant, 9 = pleasant) and one on the right for sexual arousal (1 = not arousing, 9 = arousing). In order to assess the viewing time, the time from stimulus onset until the completion of the second rating (sexual arousal rating) was measured without the knowledge of the participant. The viewing time is an indirect measure of sexual relevance of a stimulus because it is positively correlated with the amount of sexual arousal a stimulus elicits (e.g., Harris, Rice, Quinsey, & Chaplin, 1996).

Data Analysis

Eye Movements

Eye movement data were analyzed with BeGaze™ 2 (SensoMotoric Instruments GmbH, Berlin, Germany). BeGaze™ 2 uses a dispersion-threshold identification algorithm to identify fixations; thus, if eye movements were stable within a circular area of 1° of visual angle for at least 100 ms, this was classified as a fixation (Salvucci & Goldberg, 2000). In order to analyze visual attention to the different aspects of the stimulus display, we divided each stimulus display into two areas of interest (AOIs). Each entire picture (woman, man, boy, girl) constituted one AOI. The AOIs thus differed in respect to two factors: age and gender. Two dependent variables were calculated for each AOI: the number of first fixations and the relative cumulative fixation time (relative fixation time). The number of first fixations was defined as the number of all fixations which were located in the space of the relevant AOI, and which occurred first after onset of the stimulus. The number of first fixations is a measure for the initial orientation (Nummenmaa et al., 2006). Because subjects had to fixate a cross preceding the stimulus display, participants’ first fixations were typically in the middle of the stimulus display. Therefore, we used the second fixation as a measure of the first fixation the subject generated (Rupp & Wallen, 2007). The relative cumulative fixation time was defined as the sum of the fixation duration of all fixations located in the space of the relevant AOI, divided by the whole presentation time. The relative cumulative fixation time is a measure for the overall attention a specific AOI attracts (Ellis & Smith, 1985). The data were exported to SPSS (Version 17, SPSS Inc., Chicago). For every dependent variable, a 2 (age of the presented person: child vs. adult) × 2 (gender of the presented person: male vs. female) repeated measures ANOVA was calculated. Significant interactions were further analyzed with Bonferroni-adjusted post-hoc t-tests.
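
To make the fixation criterion concrete, here is a minimal sketch of a dispersion-threshold (I-DT) detector in the spirit of Salvucci and Goldberg (2000). It is not the BeGaze implementation, whose internals are not described here. The function and type names are hypothetical, gaze samples are assumed to be (time in ms, x in degrees, y in degrees) tuples sorted by time, and dispersion is approximated as the sum of the horizontal and vertical ranges rather than as a true circular area.

```python
from typing import List, NamedTuple, Sequence, Tuple

Sample = Tuple[float, float, float]  # (t_ms, x_deg, y_deg)


class Fixation(NamedTuple):
    start_ms: float
    end_ms: float
    x: float
    y: float


def _dispersion(window: Sequence[Sample]) -> float:
    xs = [s[1] for s in window]
    ys = [s[2] for s in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(samples: Sequence[Sample],
                     max_dispersion: float = 1.0,
                     min_duration_ms: float = 100.0) -> List[Fixation]:
    fixations = []
    i, n = 0, len(samples)
    while i < n:
        # Find the shortest window starting at i that spans the minimum duration.
        j = i
        while j < n and samples[j][0] - samples[i][0] < min_duration_ms:
            j += 1
        if j >= n:
            break  # remaining samples are too few for a minimum-length window
        if _dispersion(samples[i:j + 1]) <= max_dispersion:
            # Extend the window while the gaze stays within the threshold.
            while j + 1 < n and _dispersion(samples[i:j + 2]) <= max_dispersion:
                j += 1
            window = samples[i:j + 1]
            cx = sum(s[1] for s in window) / len(window)
            cy = sum(s[2] for s in window) / len(window)
            fixations.append(Fixation(window[0][0], window[-1][0], cx, cy))
            i = j + 1
        else:
            i += 1  # slide the window start forward by one sample
    return fixations


# Synthetic 60 Hz recording: 12 samples at (0, 0), then 12 samples at (5, 5).
demo = [(k * 1000.0 / 60, 0.0, 0.0) for k in range(12)]
demo += [(k * 1000.0 / 60, 5.0, 5.0) for k in range(12, 24)]
demo_fixations = detect_fixations(demo)
```

At the 60 Hz sampling rate of the eye tracker, the 100 ms minimum corresponds to a window of about seven consecutive samples.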

Subjective Ratings and Viewing Time

A 2 (age: child vs. adult) × 2 (gender: male vs. female) repeated measures ANOVA was calculated for the three dependent variables (sexual arousal rating, valence rating, viewing time). Significant interactions were further analyzed with Bonferroni-adjusted post-hoc t-tests. The subjective ratings and viewing time data of two participants could not be assessed due to technical difficulties.
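
The post-hoc procedure can be sketched as follows. This is an illustration under assumptions, not the SPSS procedure itself: it assumes that all six pairwise comparisons among the four stimulus categories enter the Bonferroni adjustment and that the nominal alpha level is .05, neither of which is stated explicitly in the text. The function names are hypothetical, and the sketch computes only the paired t statistic and its degrees of freedom, not the p value.

```python
import math
from itertools import combinations


def paired_t(a, b):
    """Paired-samples t statistic and degrees of freedom for two score lists."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    return mean_diff / math.sqrt(var_diff / n), n - 1


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold after Bonferroni adjustment."""
    return alpha / n_comparisons


categories = ["woman", "man", "girl", "boy"]
pairs = list(combinations(categories, 2))           # six pairwise comparisons
adjusted_alpha = bonferroni_alpha(0.05, len(pairs))  # assumed nominal alpha

# Toy example with made-up ratings from three participants.
toy_t, toy_df = paired_t([2, 4, 6], [1, 2, 3])
```

With ten participants contributing rating data, each paired comparison has 9 degrees of freedom, matching the t(9) values reported in the Results.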

Results

Sexual Arousal Ratings

Table 1 shows the means and SDs for the sexual arousal ratings as a function of stimulus type. A 2 (age) × 2 (gender) repeated measures ANOVA showed a significant main effect for age, F(1, 9) = 41.84, p < .001, η² = .82, a significant main effect for gender, F(1, 9) = 34.71, p < .001, η² = .79, and a significant age × gender interaction, F(1, 9) = 45.75, p < .001, η² = .84. Bonferroni-adjusted post-hoc t-tests showed that women were rated as significantly more arousing than men, t(9) = 6.58, p < .001, girls, t(9) = 7.60, p < .001, or boys, t(9) = 7.26, p < .001. Men were not rated significantly more arousing than girls, t(9) = 1.40, or boys, t(9) = 2.37. There was also no significant difference between girls and boys in respect to arousal rating, t(9) = 1.45.
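
The reported effect sizes can be cross-checked against the F values. The sketch below assumes that the reported effect size is partial eta squared, which the text does not state explicitly but which is consistent with the numbers; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values for the sexual arousal ratings, each with df = (1, 9).
eta_age = partial_eta_squared(41.84, 1, 9)
eta_gender = partial_eta_squared(34.71, 1, 9)
eta_interaction = partial_eta_squared(45.75, 1, 9)
```

Rounded to two decimals, these three values reproduce the effect sizes of .82, .79, and .84 given above.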

Table 1 Means and SDs for the sexual arousal ratings, valence ratings, and viewing time as a function of stimulus type

Valence Ratings

Table 1 also shows the means and SDs for the valence ratings as a function of stimulus type. A 2 (age) × 2 (gender) repeated measures ANOVA showed a significant main effect for age, F(1, 9) = 30.76, p < .001, η² = .77, for gender, F(1, 9) = 13.71, p < .005, η² = .60, and a significant age × gender interaction, F(1, 9) = 13.37, p < .005, η² = .60. Bonferroni-adjusted post-hoc t-tests showed that women were rated significantly more pleasant than men, t(9) = 3.97, p = .003, boys, t(9) = 5.82, p < .001, or girls, t(9) = 5.84, p < .001. Men were not rated significantly more pleasant than boys, t(9) = 2.67, or girls, t(9) = .69. Girls were not rated significantly more pleasant than boys, t(9) = 2.06.

Viewing Time

Table 1 also shows the means and SDs for the viewing time as a function of stimulus type. A 2 (age: child vs. adult) × 2 (gender: male vs. female) repeated measures ANOVA showed a significant main effect for age, F(1, 9) = 5.61, p = .042, η² = .38. Adults (M = 6115.94 ms, SD = 1852.49) were viewed significantly longer than children (M = 4844.95 ms, SD = 1301.59). There also was a significant main effect for gender, F(1, 9) = 6.43, p = .032, η² = .42. Female pictures (M = 5811.32 ms, SD = 1577.98) were viewed significantly longer than male pictures (M = 5149.57 ms, SD = 1239.32). The age × gender interaction, F(1, 9) = 2.94, was not significant.

Gaze Data: Experiment 1

In Experiment 1, participants saw either the picture of a woman combined with the picture of a girl or the picture of a man combined with the picture of a boy.

Initial Orienting: Number of First Fixations and Probability of First Fixation

Initial orienting of attention was assessed using the number of first fixations within one of the specified AOIs. Table 2 shows the means and SDs for number of first fixations as a function of stimulus type. A 2 (age) × 2 (gender) repeated measures ANOVA revealed only a significant main effect for age, F(1, 11) = 11.14, p = .007, η² = .50. Participants showed significantly more first fixations on adults (M = 17.04) than on children (M = 11.21). Expressing the data in probability of first fixations, first fixations were allocated to women with a probability of 57.44% when paired with a girl (42.56%) and first fixations were allocated to men with a probability of 62.65% when paired with a boy (37.35%).

Table 2 Means and SDs for the number of first fixations and the relative cumulative fixation time in Experiment 1

Attentional Engagement Over Time: Relative Cumulative Fixation Time

The attentional engagement over the entire presentation time was measured using the relative cumulative fixation time. Table 2 also shows the means and SDs for the relative cumulative fixation time as a function of stimulus type. The 2 (age) × 2 (gender) repeated measures ANOVA revealed a significant main effect for age, F(1, 11) = 36.30, p < .001, η² = .77, a significant main effect for gender, F(1, 11) = 6.14, p = .031, η² = .36, and a significant age × gender interaction, F(1, 11) = 21.06, p < .001, η² = .66. The age × gender interaction was further analyzed by Bonferroni-adjusted post-hoc t-tests. The relative cumulative fixation time was significantly longer for women than for men, t(11) = 4.47, p < .001, girls, t(11) = 7.62, p < .001, or boys, t(11) = 6.14, p < .001. The relative fixation time for men was significantly longer than for girls, t(11) = 5.29, p < .001, but not significantly longer than for boys, t(11) = 2.05. Girls were fixated significantly shorter than boys, t(11) = −3.35, p = .006.
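
The two gaze measures used in this experiment can be sketched for a single trial as follows. This is an illustrative reading of the definitions in the Data Analysis section, not the BeGaze or SPSS pipeline: the type and function names are hypothetical, AOIs are assumed to be axis-aligned rectangles in screen pixels, and the first fixation of each trial is assumed to rest on the central cross, so the second fixation is scored as the first generated fixation.

```python
from typing import NamedTuple, Optional, Sequence


class Fixation(NamedTuple):
    duration_ms: float
    x: float
    y: float


class AOI(NamedTuple):
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, fix: Fixation) -> bool:
        return self.left <= fix.x <= self.right and self.top <= fix.y <= self.bottom


def first_fixated_aoi(fixations: Sequence[Fixation],
                      aois: Sequence[AOI]) -> Optional[str]:
    """Name of the AOI hit by the second fixation of the trial, if any."""
    if len(fixations) < 2:
        return None
    for aoi in aois:
        if aoi.contains(fixations[1]):
            return aoi.name
    return None  # second fixation landed outside both pictures


def relative_fixation_time(fixations: Sequence[Fixation], aoi: AOI,
                           presentation_ms: float = 5000.0) -> float:
    """Summed duration of fixations inside the AOI divided by presentation time."""
    inside = sum(f.duration_ms for f in fixations if aoi.contains(f))
    return inside / presentation_ms


# Made-up trial: central cross, then woman, girl, woman.
woman = AOI("woman", 0, 0, 100, 100)
girl = AOI("girl", 200, 200, 300, 300)
trial = [Fixation(200, 150, 150), Fixation(1000, 50, 50),
         Fixation(500, 250, 250), Fixation(1500, 60, 60)]
first_hit = first_fixated_aoi(trial, [woman, girl])
share_woman = relative_fixation_time(trial, woman)
share_girl = relative_fixation_time(trial, girl)
```

Counting first_hit per category over all trials yields the number of first fixations, and averaging the shares per category yields the relative cumulative fixation time entered into the ANOVA.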

Gaze Data: Experiment 2

In Experiment 2, participants saw either the picture of a woman combined with the picture of a man, or the picture of a girl combined with the picture of a boy.

Initial Orienting (Number of First Fixations, Probability of First Fixation)

Table 3 shows the means and SDs for the number of first fixations as a function of stimulus type. The 2 (age) × 2 (gender) repeated measures ANOVA revealed only a significant main effect for gender, F(1, 11) = 12.68, p = .004, η² = .54. Female pictures (M = 16.21) attracted significantly more attention than male pictures (M = 12.83). Expressing the data in probability of first fixations, first fixations were allocated to women with a probability of 55.76% when paired with a man (44.24%) and first fixations were allocated to girls with a probability of 56.64% when paired with a boy (43.36%).

Table 3 Means and SDs for the number of first fixations and the relative cumulative fixation time in Experiment 2

Attentional Engagement Over Time: Relative Cumulative Fixation Time

Table 3 also shows the means and SDs for the relative cumulative fixation time as a function of stimulus type. The 2 (age) × 2 (gender) repeated measures ANOVA revealed a significant main effect for gender, F(1, 11) = 42.19, p < .001, η² = .79, and a significant age × gender interaction, F(1, 11) = 28.18, p < .001, η² = .72. The main effect for age was not significant, F(1, 11) = 3.02. Post-hoc t-tests revealed a significantly higher percentage of cumulative fixation time for women than for men, t(11) = 6.30, p < .001, girls, t(11) = 4.30, p < .001, or boys, t(11) = 5.12, p < .001. Men attracted a significantly lower percentage of relative fixation time than boys, t(11) = −4.82, p < .001, or girls, t(11) = −6.54, p < .001. Girls and boys did not differ significantly in respect to the relative cumulative fixation time, t(11) = 2.34.

Discussion

Our study used eye tracking methodology to explore early and late attentional processes in heterosexual men, while showing them pairs of simultaneously presented sexually preferred and sexually non-preferred images which competed for attention.

Self-report rating data, measured without eye tracking, showed that images of naked women were more sexually arousing and more pleasant than images of men, boys or girls. Hence, as expected, heterosexual male participants sexually preferred images of women over images of men, boys or girls. One critical point to consider is the fact that both subjective ratings were in the lower range of the 9-point Likert scale. This seems to indicate that the NRP set did not induce high sexual arousal and was not experienced as very pleasant, not even for the preferred sexual stimuli. To our knowledge, published subjective rating data for the NRP set do not exist. Despite this fact, some studies have used this stimulus set successfully in studying subjects with different sexual orientations (e.g., Mokros, Dombert, Osterheider, Zappala, & Santila, 2010). Another critical point is that the study information provided to the subjects contained the hint that the study was part of a larger study with pedophilic participants. This could have influenced the subjects’ behavior in a socially desirable way. Without additional data from pedophilic subjects, we cannot rule out that the provided information was a confounding variable. From an ethical point of view, however, we prefer to inform the subjects about the aim of the study.

The analysis of the viewing time data did not show a significant interaction effect. We only found main effects for age and gender, with longer viewing times for females and adults. Even though there was not a significant interaction effect, the means showed that women were viewed longer than men, boys or girls. Viewing time seems to correlate positively with psychophysiological sexual arousal (e.g., Harris et al., 1996; Quinsey, Ketsetzis, Earls, & Karamanoukian, 1996). Despite the fact that it is not yet clear what the reason for the viewing time effect may be, it seems to at least be robust and reliable (e.g., Imhoff et al., 2010; Sachsenmaier & Gress, 2009). Therefore, the viewing time data seem to support our interpretation of the subjective sexual arousal rating data.

With regard to late attentional processes, one major finding of our study was that over the whole presentation time, heterosexual male participants fixated on their preferred sexual stimulus (woman) longer than on their non-preferred sexual stimuli (man, girl and boy). This result supported our hypothesis that heterosexual men would devote more attention to the preferred sexual stimulus than to the sexually non-preferred stimulus when both stimuli were presented simultaneously. In that way, the finding provides further evidence that sexual stimuli attract more attention than non-sexual stimuli (e.g., Prause, Janssen, & Hetrick, 2007). Moreover, the findings were in accordance with the results of the study by Lykins et al. (2008) showing that heterosexual men looked significantly more often at female pictures than at male pictures.

The results for early attentional processes showed that the first fixation was more often directed towards the preferred sexual stimulus when it was presented simultaneously with a non-preferred sexual stimulus. This result supports our hypothesis that heterosexual men would show an initial orienting towards their sexually relevant stimulus when that stimulus was presented simultaneously with a sexually irrelevant stimulus. As the stimuli did not appear to differ in key low-level features, it seems reasonable to assume that this initial orienting effect was based on differences in sexual relevance and valence. Thus, we could replicate prior findings with emotionally positive and negative pictures (Calvo & Lang, 2004; Nummenmaa et al., 2006), because our stimuli can also be seen as emotionally positive images. Beyond that, our study showed an initial orienting toward sexually preferred stimuli. It has been hypothesized that sexually relevant stimuli are processed in the same way as other evolutionarily relevant stimuli (Spiering & Everaerd, 2007). Our findings on the number of first fixations support this hypothesis; they demonstrate that sexually preferred stimuli were prioritized by the human attentional system.

Furthermore, the current data were also consistent with the theoretical model of Spiering and Everaerd (2007), which proposes that sexual features of a stimulus receive more focal attention than sexually neutral stimulus features. The relative cumulative fixation time is a measure of the conscious allocation of attention (Ellis & Smith, 1985). Using this parameter, we showed that men looked longer at women than at men, boys or girls. Of course, these results do not support the whole model, but they provide evidence for the proposed attentional processes at the first stage of this model. In addition, Spiering and Everaerd (2007) proposed an automatic selection of sexually relevant stimulus features. As the first fixation was more often directed towards the preferred sexual stimulus, the results provide preliminary evidence for this proposition. However, with the current design, it was not possible to determine whether the initial orienting was automatic or controlled, because we did not test to what extent the subjects were able to consciously control the direction of the first fixation. Hence, it is necessary to explore this question in a direct experimental manner, as was done by Nummenmaa et al. (2006) for emotional pictures by instructing the subjects to actively avoid looking at the emotional stimulus.

With respect to the number of first fixations, two unexpected results were observed. When the sexually preferred stimulus was not present, the adult male image (Experiment 1) or the female child image (Experiment 2) attracted more first fixations. We had no specific a priori hypothesis for the neutral–neutral stimulus combinations. The result that images of girls attracted more fixations than images of boys is in agreement with the penile response profile of heterosexual men based on a large sample (n = 1066; Blanchard et al., 2010). This profile shows that women elicit the highest sexual response, followed by pubescent girls, prepubescent girls, and prepubescent boys. Pubescent boys and adult men elicit the lowest sexual response. Blanchard et al. proposed two possible psychophysiological models to describe this result. The first model (summation model) conceptualizes age and gender as separate stimulus dimensions of a sexual stimulus. The second model (bipolar model) assumes that men respond to a potential sexual stimulus as a gestalt, which they evaluate in terms of global similarities to other potential sexual objects. Based on additional sexual response profiles for subjects with different sexual orientations, Blanchard et al. concluded that the bipolar model best described the different sexual response profiles. Applied to our data, the bipolar model would predict that, for heterosexual men, images of women elicit the highest sexual response, followed by girls and boys. Therefore, the bipolar model can also explain our result that images of girls attracted more fixations than images of boys. On the other hand, the bipolar model assumes that pictures of men will elicit the lowest sexual response. This did not agree with our finding that the man received more first fixations than the boy when the image of a man was combined with the image of a boy. A possible explanation for this finding is that, from an evolutionary perspective, adult men are romantic rivals. Studies have shown that men pay particular attention to other men who might be perceived as romantic competitors (Maner, Gailliot, Rouby, & Miller, 2007). The amount of attention paid to romantic competitors depends on several factors, such as the amount of romantic jealousy or the attractiveness of the competitor, and seems to be based on implicit cognitions (Maner, Miller, Rouby, & Gailliot, 2009). Since we did not collect data in this respect, it cannot be determined whether any of these factors applied to our sample and influenced the results.

In summary, the current study showed for the first time an attentional bias to sexually relevant stimuli when presented simultaneously with sexually irrelevant pictures. This finding, together with the finding that heterosexual men maintained their attention to sexually relevant stimuli, emphasizes the importance of investigating early and late attentional processes while viewing sexual stimuli. Furthermore, the current study showed that sexually relevant stimuli were favored by the human attentional system. Despite the relatively small sample size, the results with regard to the maintenance of attention showed large effect sizes and warrant further research. On the other hand, the effect sizes for the number of first fixations appeared low compared to the effect sizes for relative fixation time. Therefore, future research should include more participants in order to enhance the statistical power. Future research should also include participants with homosexual and heterosexual orientations, in order to explore to what extent early and late attentional processes differ with sexual orientation, and should investigate whether the first fixation is automatic or subject to conscious cognitive control. If the hypothesis of an automatic process is confirmed, our experimental design could be a useful tool to detect sexual orientation or sexual preferences independently of social desirability. Moreover, we paired adult and child images and were able to show that heterosexual men looked significantly longer at adult females than at child females. It may be hypothesized that participants with a deviant pedosexual preference will show an inverse attentional bias. Thus, this design could possibly be used to detect deviant sexual preferences such as pedophilia.