Open Access
Original Paper

Archives of Sexual Behavior, Volume 39, Issue 1, pp 5-56

Agreement of Self-Reported and Genital Measures of Sexual Arousal in Men and Women: A Meta-Analysis

Authors

  • Meredith L. Chivers
    • Department of Psychology, Queen’s University
  • Michael C. Seto
    • Royal Ottawa Health Care Group
  • Martin L. Lalumière
    • Department of Psychology and Neuroscience, University of Lethbridge
  • Ellen Laan
    • Department of Sexology and Psychosomatic Obstetrics and Gynecology, Academic Medical Center, University of Amsterdam
  • Teresa Grimbos
    • Department of Human Development and Applied Psychology, University of Toronto

DOI: 10.1007/s10508-009-9556-9

Abstract

The assessment of sexual arousal in men and women informs theoretical studies of human sexuality and provides a method to assess and evaluate the treatment of sexual dysfunctions and paraphilias. Understanding measures of arousal is, therefore, paramount to further theoretical and practical advances in the study of human sexuality. In this meta-analysis, we review research to quantify the extent of agreement between self-reported and genital measures of sexual arousal, to determine if there is a gender difference in this agreement, and to identify theoretical and methodological moderators of subjective-genital agreement. We identified 132 peer- or academically-reviewed laboratory studies published between 1969 and 2007 reporting a correlation between self-reported and genital measures of sexual arousal, with total sample sizes of 2,505 women and 1,918 men. There was a statistically significant gender difference in the agreement between self-reported and genital measures, with men (r = .66) showing a greater degree of agreement than women (r = .26). Two methodological moderators of the gender difference in subjective-genital agreement were identified: stimulus variability and timing of the assessment of self-reported sexual arousal. The results have implications for assessment of sexual arousal, the nature of gender differences in sexual arousal, and models of sexual response.

Keywords

Sexual psychophysiology; Sexual arousal; Sex difference; Gender difference; Plethysmography; Photoplethysmography

Introduction

The human sexual response is a dynamic combination of cognitive, emotional, and physiological processes. The degree to which one product of these processes, the individual’s experience of sexual arousal, corresponds with physiological activity is a matter of interest to many researchers and practitioners in sexology because subjective experience (or self-report) and genital measures of sexual arousal do not always agree. In this article, we label this correspondence concordance or subjective-genital agreement.

Examples of low subjective-genital agreement abound in both clinical and academic sexology. Some men report feeling sexual arousal without concomitant genital changes (Rieger, Chivers, & Bailey, 2005) and experimental manipulations can increase penile erection without affecting subjective reports of sexual arousal (Bach, Brown, & Barlow, 1999; Janssen & Everaerd, 1993). Similarly, some women show genital responses without reporting any experience of sexual arousal (Chivers & Bailey, 2005) and self-reported sexual arousal is subject to impression management, as in the greater reluctance among women high in sex guilt to report feeling sexually aroused (Morokoff, 1985).

Thus, determining the extent of the agreement between self-reported and genital measures of sexual arousal has both practical and theoretical significance. Practically, the majority of researchers and clinicians who assess sexual arousal do not have access to measures of genital response and, therefore, often rely on self-report. Those who employ self-report measures would like to know the extent to which they are measuring the same response as clinicians or researchers who use genital measures and vice versa. Moreover, knowing the extent of the agreement between self-reported and genital measures of sexual arousal, and identifying moderators of this subjective-genital agreement, would inform our models of sexual response, our understanding of sexual dysfunctions, and psychometric methods to assess each aspect of sexual response.

One of the most frequently suggested moderators of subjective-genital agreement is gender; studies of men tend to produce higher correlations between measures of subjective and genital sexual arousal than studies of women (for a narrative review, see Laan & Janssen, 2007). Two positions can be described regarding gender as a moderator of subjective-genital agreement. One position is that female and male sexual response systems are truly similar, but the lower concordance estimates observed among women are the result of methodological issues in these studies, such as differences in the assessment devices or procedures that are used. The other position accepts the gender difference in concordance as real, whether it is a result of fundamental differences in sexual response or the effects of learning and other environmental influences. Before we can determine which of these positions has merit, however, the size and direction of the gender difference in concordance must be clearly documented.

The Present Study

The purpose of this meta-analysis was to provide a quantitative review of the sexual psychophysiology research examining self-reported and genital sexual arousal in women and men. The primary goal was to determine if a gender difference in the concordance between psychological and physiological measures of sexual response was observed across these studies. We also examined potential moderators of concordance to determine the extent to which the observed gender difference in concordance might represent a real gender difference or methodological artifact, and to test theoretically-derived hypotheses drawn from sexual selection, information processing, and learning theories regarding factors that influence human sexual response. We focused on these particular theories as compelling ultimate or proximate explanations of gender differences in sexual response, respectively, and because we could test hypotheses drawn from these theories using the variables that could be coded in this meta-analysis. Potential moderators are discussed in the next section.

The Gender Difference in Concordance is Due to Methodological Artifact

It may be that there is no real gender difference in concordance, but the current methods of assessing self-reported and genital sexual responses attenuate concordance estimates in women or increase concordance estimates in men. The gender difference in concordance, therefore, might be the result of methodological factors. These could occur at any stage of a laboratory paradigm designed to evoke sexual arousal, have participants assess their sexual response, objectively measure their sexual response, and calculate an index of subjective-genital agreement. These stages involve variation in stimulus characteristics (modality, content, length, and variation in sexual stimuli); assessment of self-reported sexual arousal (method of reporting, timing, operationalization); assessment of genital sexual arousal; statistical methods (type of correlation, number of data points); and participant characteristics (age for both men and women and hormonal fluctuations for women). Below, we consider each set of moderators and the influence they may have on concordance estimates.

Stimulus Characteristics

Stimulus Modality

Sexual arousal is typically elicited in laboratory settings by exposure to internal (fantasy) or external (visual images or audiotaped descriptions of sexual acts) sources of sexual stimuli. Modality effects have been observed, such that women show greater subjective and genital responses to audiovisual depictions of sexual activity compared with audiotaped descriptions of sexual interactions or sexual fantasy (Heiman, 1980; Stock & Geer, 1982). Men also demonstrate greater subjective and genital responses to audiovisual depictions of sexual interactions compared with audiotaped descriptions of sexual activity or still pictures of couples engaged in intercourse (Sakheim, Barlow, Beck, & Abrahamson, 1985). Audiovisual depictions of couples engaged in intercourse yield greater genital responses in both women and men than do still photographs of nude women and men (Laan & Everaerd, 1995a, 1995b; Laan, Everaerd, van Aanhold, & Rebel, 1993; Mavissakalian, Blanchard, Abel, & Barlow, 1975). It is unclear, however, which modality of sexual stimulus, if any, produces greater concordance of subjective and genital responses in either women or men.

There is evidence of a gender difference in responses to specific sexual content. Still photographs of nude or partially clothed women or men do not generate either self-reported or genital sexual arousal in heterosexual women (Laan & Everaerd, 1995a, 1995b), but are sufficient to generate substantial subjective and genital responses in heterosexual men (Tollison, Adams, & Tollison, 1979). For men, depictions of affectionate, nonexplicit interactions (e.g., cuddling, kissing) between clothed women and men significantly increase subjective and genital responses but, for women, both significant arousal responses and null effects to these same stimuli have been reported (Suschinsky, Lalumière, & Chivers, 2009; Wincze, Venditti, Barlow, & Mavissakalian, 1980). More recently, we have found that both heterosexual and homosexual men, and homosexual women but not heterosexual women, showed genital responses to film depictions of their preferred sex engaged in nude, nonsexual activities, such as walking on the beach (Chivers, Seto, & Blanchard, 2007).

The inclusion of sexual vocalizations (such as sighs, moans, and grunts) in audiovisual sexual stimuli augments both subjective and genital responses among men (Gaither & Plaud, 1997), but not among women (Lake Polan et al., 2003). Other data suggest that including vocalizations amplifies self-reported sexual arousal in both women and men (Pfaus, Toledano, Mihai, Young, & Ryder, 2006). For other kinds of content, however, both women and men respond similarly: audiovisual depictions of couples engaging in sexual intercourse elicited greater subjective and genital responses than did films of solitary women or men masturbating in both sexes (Chivers et al., 2007).

According to sexual selection theory, the importance of visual sexual cues to sexual responses may differ between women and men, influencing their appraisal of their sexual arousal. Symons (1979) discussed gender differences in processing of visual and nonvisual forms of erotica and suggested that visual cues are more salient for men. A study examining brain activation during sexual arousal in women and men supports Symons’ notion that visual sexual stimuli possess greater reward value for men, as evidenced by differential activation of reward-related pathways (Hamann, Herman, Nolan, & Wallen, 2003). This study did not compare activation patterns between visual and other modalities of sexual stimuli, however, so it is not clear whether differential activation pertained to sexual stimuli in general or to only particular modalities of sexual stimuli.

For men, we expected concordance to be higher for visual (photographs, videos) than for nonvisual (fantasy, text, auditory descriptions) modalities. For women, we speculated, based on observations of the greater female consumption of nonvisual forms of erotic literature (for reviews, see Malamuth, 1996; Salmon & Symons, 2003), that concordance would be greater when assessed using nonvisual modalities of sexual stimuli.

We also hypothesized that, for women, self-generated sexual stimuli (sexual fantasies) would result in significantly greater concordance than stimuli produced by others, even though fantasy is likely to produce lower levels of genital response (Heiman, 1980). The rationale for this hypothesis is that self-generated stimuli are less likely to evoke negative affect because women would be unlikely to imagine content that they find unpleasant. Negative affect appears to reduce self-reported sexual arousal and thus could reduce concordance (Laan & Janssen, 2007). Consistent with this idea, one study reported that subjective-genital agreement was greater among women with sexual arousal disorder when sexual fantasy was used as a sexual stimulus; concordance was positive with sexual fantasy, negative while listening to audiotaped stories, and not significantly different from zero during a film presentation (Morokoff & Heiman, 1980). Another study showed greater concordance estimates during sexual fantasy compared to audiotaped stories during the luteal phase of the menstrual cycle (Schreiner-Engel, Schiavi, & Smith, 1981).

Stimulus Length

We predicted that stimulus length would be positively associated with concordance, to the extent that longer stimuli could produce greater variation in subjective and genital responses and thus would allow for larger correlations than shorter stimuli that produced less variation. Response variation delimits the size of the correlation that can be obtained: Thus, an absence of both subjective and genital sexual arousal during a stimulus presentation would indicate high concordance, yet would produce a correlation of zero; the same situation applies if a person produces maximal subjective and genital responses throughout a stimulus presentation. Stimulus length may also be associated with concordance because longer stimuli could produce more reliable estimates of response than shorter stimuli.
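The range-restriction argument above can be illustrated with a minimal sketch. All values are hypothetical and are not drawn from any study in this review; `pearson_r` is an illustrative helper, not a method used by the authors. The sketch computes a Pearson correlation over a full range of responses and over a restricted low range of the same data. Note one assumption in how the no-variance case is handled: when responses do not vary at all, the coefficient is strictly undefined, so the helper returns `None` rather than a literal zero.

```python
import math

def pearson_r(x, y):
    """Pearson correlation; returns None when either variable has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # no variation, so no correlation can be estimated
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: subjective ratings (0-9) and genital responses (arbitrary units)
subjective = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
genital = [1, 0, 3, 2, 5, 4, 7, 6, 9, 8]

full_range = pearson_r(subjective, genital)          # all ten observations
restricted = pearson_r(subjective[:4], genital[:4])  # only the four lowest responses
flat = pearson_r([0, 0, 0, 0], [0, 0, 0, 0])         # no response on either measure

print(full_range, restricted, flat)
```

In this toy data set the same underlying relationship yields a smaller coefficient once the range of responses is restricted, which is the pattern the prediction about stimulus length relies on.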

Stimulus Variation

Increasing the potential variability in self-reported and genital sexual responses can also be achieved by using a range of sexual stimuli that vary in specific content and modality; variation in both self-reported and genital sexual arousal across different modalities of sexual stimuli has been observed for both women and men (Heiman, 1980; Sakheim et al., 1985). We would also expect higher subjective-genital agreement for studies that presented both preferred and nonpreferred stimuli, because showing only preferred or only nonpreferred stimuli1 may result in a restriction of range in response, thereby restricting the potential magnitude of a subjective-genital correlation.

Measurement of Self-Reported Sexual Arousal

Method of Reporting

Self-reported sexual arousal (subjective response) is the individual’s appraisal and report of their emotional state of sexual arousal. Most researchers ask participants to rate their sexual arousal either after the presentation of a sexual stimulus, using Likert-type items, or during the presentation of a sexual stimulus, using an apparatus such as a lever that the participant can move as they subjectively respond to the stimulus. Laan (1994) reported good internal consistency (α = .82) for a measure of women’s self-reported sexual arousal that included the following items: overall sexual arousal, strongest sexual arousal, and genital sensations, all rated using unipolar visual analog scales. No research on other measures of subjective response reliability, such as test–retest reliability, has been reported in the literature.

Operationalization

It is important to note that a subjective sexual response is defined differently here from a self-reported genital response; the former refers to appraisal of an emotional state of sexual arousal whereas the latter is a subjective estimate of the extent of one’s physiological responding, such as estimating the percentage of erection attained for men or the perception of genital sensations or wetness for women. It is not clear how much of one’s appraisal of subjective sexual arousal is influenced by one’s perception of genital responding, or vice versa, but these measures are highly positively correlated in both women (Slob, Bax, Hop, Rowland, & van der Werff ten Bosch, 1996) and men (Rowland & Heiman, 1991). Examining the correlations between these two self-report measures and physiologically-measured genital sexual arousal may inform understanding of how experiencing sexual arousal and perceiving physical changes are related to genital response.

Timing of Assessment

Subjective sexual arousal can be assessed at different times during sexual psychophysiology data acquisition. Logically and statistically, the potential correlation between subjective and genital sexual arousal should be highest when they are measured contiguously. Conversely, the correlation should be lower when subjective response is recorded after a trial has ended, or after a study session has ended, because the participant’s report of their subjective sexual arousal may be influenced by recall or other kinds of cognitive biases. Even if it is not correct that contiguous assessment produces the highest correlation between subjective and genital arousal, researchers and clinicians are probably most interested in concordance for arousal responses that occur simultaneously and less interested in the agreement between genital response at a particular point in time and subjective response after some time has elapsed (e.g., genital responses during a short period of sexual stimulation and subjective responses minutes later, after the sexual stimulation has stopped).

There is evidence that using a lever, or similar device, contiguous with processing of a sexual stimulus does not affect genital responding in women, but does result in lower genital responses in men (Wincze et al., 1980), perhaps as a result of distraction (Geer & Fuhr, 1976). Some researchers have reported that assessing subjective sexual arousal contiguously results in lower concordance than using post-trial ratings in women (Laan et al., 1993). Thus, we predicted that the timing of the assessment of subjective sexual arousal would moderate concordance and that gender differences in the effects of assessment timing on sexual response might help explain the gender difference in concordance that has been observed.

Measurement of Genital Sexual Arousal2

Phallometry

Various physiological parameters, such as pupil dilation, heart rate, and galvanic skin response, have been examined as potential objective measures of sexual arousal, but changes in penile erection, assessed using penile plethysmography, are the most specific measure of sexual response in men (Zuckerman, 1971). An objective method of measuring penile erection was developed by Freund (1963). Changes in penile circumference, measured using a gauge placed around the shaft of the penis, or changes in penile volume (assessed using gas displacement in a sealed cylinder placed over the penis) are the most commonly used methods in sexual arousal research. Increases in penile circumference or volume are interpreted as evidence of greater genital sexual arousal. Circumferential and volumetric measurements are highly correlated when men show at least 2.5 mm of penile circumference change in the laboratory (Kuban, Barbaree, & Blanchard, 1999).

Regarding discriminative validity, penile responses can distinguish heterosexual and homosexual men, men who are sexually attracted to prepubescent children from those who are sexually attracted to adults, fetishists from nonfetishists, rapists from nonrapists, and sadistic men from nonsadistic men (e.g., Blanchard, Klassen, Dickey, Kuban & Blak, 2001; Freund, 1963; Freund, Seto, & Kuban, 1996; Lalumière, Quinsey, Harris, Rice, & Trautrimas, 2003; Sakheim et al., 1985; Seto & Kuban, 1996). Penile responses can also distinguish sexually functional men from men with sexual dysfunctions, such as men with premature ejaculation (Rowland, van Diest, Incrocci, & Slob, 2005).

Regarding predictive validity, phallometrically-assessed sexual arousal to stimuli depicting children or sexual violence is an important predictor of sexual reoffending among sex offenders (Hanson & Morton-Bourgon, 2005). Although the predictive validity of phallometrically-assessed sexual arousal in nonforensic samples has not been systematically examined, one study found that penile responses to sexual stimuli in the laboratory were related to increases in sexual behavior on the day following the laboratory session (Both, Spiering, Everaerd, & Laan, 2004).

Vaginometry

Objective assessment of the female genital response began in the 1970s with the development of the vaginal photoplethysmograph (Sintchak & Geer, 1975; for a more thorough discussion, see Geer & Janssen, 2000). The photoplethysmograph—a small, acrylic probe the size of a menstrual tampon—records haemodynamic changes in the vaginal epithelium using light reflectance. The photoplethysmograph signal is filtered into two components: vaginal blood volume (VBV), which reflects slow changes in blood pooling, and vaginal pulse amplitude (VPA), which reflects phasic changes in vasocongestion with each heart beat. These two vaginal signals have different properties (Hatch, 1979). Changes in VPA are specific to sexual stimuli, while VBV appears to increase in response to both sexual and anxiety-inducing stimuli (Laan, Everaerd, & Evers, 1995; Suschinsky et al., 2009). VPA typically returns faster to a pretrial baseline response than VBV (Laan, Everaerd, & Evers, 1995). No consistent evidence for menstrual cycle effects on VPA has been observed (Hoon, Bruce, & Kinchloe, 1982; Meuwissen & Over, 1992; Schreiner-Engel et al., 1981). Because the VPA signal demonstrates better psychometric properties than VBV, the majority of researchers have reported VPA.

Vaginometry using VPA demonstrates good reliability (Prause, Janssen, Cohen, & Finn, 2002; Wilson & Lawson, 1978) and there is evidence of its predictive and discriminative validity. Vaginal responses to sexual stimuli are related to increases in post-laboratory sexual behavior (Both et al., 2004). VPA assessed during baseline response can distinguish premenopausal women from postmenopausal women (Brotto & Gorzalka, 2002; Laan, van Driel, & van Lunsen, 2008; Laan, van Lunsen, & Everaerd, 2001), and VPA assessed during sexual response can differentiate heterosexual from homosexual women when stimuli depicting solitary males and females are used (Chivers et al., 2007). It is unclear whether VPA can discriminate sexually dysfunctional from functional women. The majority of studies find no differences in genital response between sexually functional and dysfunctional groups (for additional data and a review, see Laan et al., 2008). One study, however, did find VPA differences between subgroups of women with and without female sexual arousal disorder, using the newer sexual dysfunction definitions (Basson et al., 2003) discriminating between women reporting absent or impaired genital sexual arousal (genital sexual arousal disorder), women reporting absence of or markedly diminished feelings of sexual arousal (subjective sexual arousal disorder), and women reporting absence of, or markedly diminished feelings of sexual arousal, sexual excitement, or sexual pleasure (combined genital and subjective sexual arousal disorder) (Brotto, Basson, & Gorzalka, 2004). This discrepancy in the discriminative validity of VPA reflects recent changes in DSM-IV criteria for female sexual arousal disorder (FSAD); FSAD groups from previous studies are best compared to the combined genital and subjective sexual arousal disorder group used by Brotto et al., and these women did not differ from the control group in terms of genital responsiveness. Poor discrimination in genital response on the basis of sexual functioning, however, likely reflects uncertainty regarding the role of genital response in FSAD rather than psychometric limitations of VPA.

The type of signal obtained from vaginal photoplethysmography may affect concordance. Because VPA and VBV reflect different, though related, vasocongestive processes (Geer & Janssen, 2000), it is possible they are differentially related to subjective sexual arousal. For example, VBV has been shown in one study to be more reactive to negative affect (Laan, Everaerd, & Evers, 1995). VBV may, therefore, correlate better with self-reports of sexual arousal than VPA, to the extent that self-reported sexual arousal is influenced by emotional state.

No consensus exists on how VPA data should be transformed prior to statistical analysis (Hatch, 1979). Because the unit of change (mV) does not correspond to a clearly meaningful physiological correlate3 (Levin, 1992), comparisons between women using raw mV change scores are difficult to interpret. Some authors report change in VPA as a percent increase over baseline (Both et al., 2004; Heiman, 1977), sometimes called the “maximum change technique” (Rellini & Meston, 2006). Another means of transforming VPA is to ipsatize genital response data within participants (Chivers, Rieger, Latty, & Bailey, 2004). Responses are, therefore, expressed as a function of the individual’s own distribution of responses across a set of sexual stimuli, in SD units, making relative comparisons of responses to different stimuli across participants meaningful. Another method is to log-transform genital VPA, because raw scores typically demonstrate positive skew (Meston, 2006). It is unknown which method of data reduction for VPA data is the best in terms of maximizing the discriminative or predictive validity of vaginal photoplethysmography.
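The three transformations described above can be sketched as follows. This is an illustration only: the function names and the VPA values are hypothetical, and two details are assumptions because the text does not specify them, namely the use of the sample standard deviation for ipsatizing and the natural logarithm for the log transform.

```python
import math
import statistics

def percent_change(baseline_mv, stimulus_mv):
    """Change in VPA expressed as a percent increase over baseline."""
    return 100.0 * (stimulus_mv - baseline_mv) / baseline_mv

def ipsatize(responses):
    """Express each response in SD units of the participant's own distribution.
    Assumption: the sample SD is used; a population SD would be equally plausible."""
    mean = statistics.mean(responses)
    sd = statistics.stdev(responses)
    return [(r - mean) / sd for r in responses]

def log_transform(responses):
    """Natural-log transform to reduce positive skew (the log base is an assumption)."""
    return [math.log(r) for r in responses]

# Hypothetical VPA amplitudes (mV) for one participant across three stimuli
baseline = 2.0
responses = [2.0, 3.0, 4.0]

pct = [percent_change(baseline, r) for r in responses]
z_scores = ipsatize(responses)
logged = log_transform(responses)

print(pct, z_scores, logged)
```

Each transformation answers a different question: percent change is anchored to the participant's baseline, ipsatized scores are anchored to the participant's own spread of responses across stimuli, and the log transform only reshapes the distribution of raw scores.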

Thermography

The second most commonly used physiological measure of female genital response is thermography, most commonly assessed using a labial thermistor. This device consists of a thermistor placed on a small clip that is attached to the labia minora. It measures changes in skin temperature of the labia minora during genital vasocongestion. The labial thermistor has also shown good psychometric properties. Labial temperature reliably increases with exposure to sexual, but not neutral, stimuli (Henson, Rubin, Henson, & Williams, 1977). For the majority of women, VPA, VBV, and labial temperature are positively correlated with each other during sexual stimuli, but agreement between these genital measures is variable during presentations of nonerotic stimuli (Henson, Rubin, & Henson, 1979).

Payne and Binik (2006) have argued that labial temperature is a more consistent measure of genital response than VBV or VPA and is more strongly correlated with self-reported sexual arousal than VPA. Labial temperature is unaffected by orgasm, unlike VPA (Henson, Rubin, & Henson, 1982). At the same time, menstrual cycle effects have been reported for labial temperature change recorded during the follicular and luteal phases of the menstrual cycle (Slob, Ernste, & van der Werff ten Bosch, 1991; Slob et al., 1996). Onset of change in labial temperature is typically slower than VPA and temperature takes longer to return to a pretrial level of response; some studies reported that labial temperature was more consistent between testing sessions than VPA or VBV (Payne & Binik, 2006, but see also Slob, Koster, Radder, & van der Werff ten Bosch, 1990; Slob et al., 1991, 1996). Unlike VPA, the units of change are in Celsius degrees, and thus the unit of response can be directly compared across participants. On the other hand, unlike VPA, thermistor readings may be subject to a ceiling effect wherein genital responding continues to increase but labial temperature reaches a physiological maximum.

The type of genital measure used to assess sexual response in women may affect concordance estimates. Vaginal photoplethysmography measures changes in vaginal vasocongestion, which is not directly perceptible in women, whereas thermography measures change in the temperature of the external genitalia, which may be more perceptible. There is some preliminary evidence suggesting that awareness of changes in body temperature is related to feelings of sexual arousal; using factor analysis, Laan (1994) reported that awareness of labial temperature change loaded onto both a sexual feelings and a physical feelings factor, suggesting some overlap between temperature changes during both sexual and general physiological arousal. Thus, thermography may produce higher agreement with self-reported sexual arousal than vaginal photoplethysmography, to the extent that subjective response is influenced by an awareness of genital sensations. In addition, VPA may assess initial changes in blood flow before any labial temperature change occurs, and this may also affect concordance estimates because VPA would capture a fuller range of genital response.

Comparisons Between Female and Male Sexual Response

Though they measure the same genital process of vasocongestion, vaginal photoplethysmography and penile plethysmography use different physiological endpoints to estimate sexual response; vaginal photoplethysmography uses light reflectance to assess color change in the vaginal epithelium, whereas penile plethysmography assesses changes in the size of the penis. If an identical physiological endpoint, such as temperature change using thermography, were used for both women and men, would the gender difference in concordance still be found? For both women and men, awareness of temperature changes in the genitals may be an important cue of sexual arousal. Alternatively, information processing theory as applied to sexual functioning (see below) would posit that, regardless of physiological endpoints, a woman’s experience of sexual arousal is not highly influenced by her perception of physiological changes. It is unclear whether type of genital assessment method moderates a gender difference in concordance.

Statistical Methods

Type of Correlations

There are at least two ways of thinking about the concordance of self-reported and genital sexual arousal (Geer & Janssen, 2000). The first way has to do with the extent to which self-reported and genital responses agree within an individual; in other words, do individual changes in self-reported sexual arousal correspond with a similar change in genital sexual arousal, across different stimuli and different conditions? This type of concordance can be estimated by calculating within-subjects correlations between measures of self-reported and genital sexual arousal. Each participant produces a set of data points that is used to calculate a within-subjects correlation for that participant, which can then be averaged across the participants in a group (Bland & Altman, 1995a).

The second way to think about concordance has to do with the extent to which self-reported and genital responses agree within a group; in other words, are the individuals who produce the largest self-reported responses also the same individuals who produce the largest genital responses? This type of concordance can be estimated by calculating between-subjects correlations between measures of self-reported and genital sexual arousal. Each participant produces a pair of data points and the set of data points for the participants in a group are used to calculate a between-subjects correlation (Bland & Altman, 1995b).
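The two ways of calculating concordance can be contrasted in a short sketch. The data are hypothetical and were constructed so that the two estimates diverge. Two choices are assumptions made for illustration: within-subjects correlations are combined by a simple arithmetic mean, and each participant's pair of data points for the between-subjects correlation is taken as their mean response across stimuli.

```python
import math

def pearson_r(x, y):
    """Pearson correlation for two equal-length sequences with nonzero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: participant -> (subjective ratings, genital responses) over 3 stimuli
data = {
    "P1": ([1, 2, 3], [10, 20, 30]),
    "P2": ([4, 5, 6], [5, 10, 15]),
    "P3": ([7, 8, 9], [1, 2, 3]),
}

# Within-subjects: one correlation per participant across stimuli, then averaged
within_rs = [pearson_r(subj, gen) for subj, gen in data.values()]
within_mean = sum(within_rs) / len(within_rs)

# Between-subjects: one (subjective, genital) pair per participant, correlated across the group
subj_means = [sum(subj) / len(subj) for subj, _ in data.values()]
gen_means = [sum(gen) / len(gen) for _, gen in data.values()]
between_r = pearson_r(subj_means, gen_means)

print(within_mean, between_r)
```

In this constructed example every participant's two measures rise and fall together across stimuli, so the mean within-subjects correlation is high, yet the participants with the highest average ratings have the lowest average genital responses, so the between-subjects correlation is strongly negative. The two calculations therefore address different questions and need not agree.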

It is possible that the gender difference in subjective-genital agreement depends on how concordance is conceptualized and thus how the correlation is calculated. In addition, researchers may be more interested in intra-individual or intra-group concordance, depending on the questions they are examining. We therefore examined the impact of type of correlation calculation in this meta-analysis.

Number of Data Points

Statistically, more reliable measurements tend to be obtained with a higher number of observations, so studies using larger samples (between-subjects correlations), a greater number of stimulus trials (within-subjects correlations), or a greater number of measurement epochs (within-subjects correlations) increase the number of data points used to calculate concordance, and should therefore yield more reliable estimates of concordance. Because concordance can be calculated in two ways (within-subjects correlations, for which the number of data points is the number of measurements taken for each participant, or between-subjects correlations, for which it is the number of participants in the study), we examined number of data points separately for each type of correlation.
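One way to see why more data points give a more reliable estimate is to look at how the confidence interval around a correlation narrows as the number of observations grows. The sketch below uses the standard Fisher z approximation, which assumes independent observations; the correlation of .50 and the sample sizes are hypothetical, and `fisher_ci` is an illustrative helper rather than a procedure described in this article.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson r based on n independent pairs."""
    if n <= 3:
        raise ValueError("the Fisher z approximation needs more than 3 data points")
    z = math.atanh(r)              # Fisher z-transformation of r
    se = 1.0 / math.sqrt(n - 3)    # approximate standard error of z
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

r = 0.5  # hypothetical observed correlation
for n in (10, 30, 100):
    lo, hi = fisher_ci(r, n)
    print(f"n={n:3d}  r={r:.2f}  approx. 95% CI = [{lo:.2f}, {hi:.2f}]")
```

The same observed correlation is estimated far more precisely from 100 data points than from 10, which is why the number of data points is worth examining as a moderator.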

Participant Characteristics

Monthly fluctuations in reproductively-related hormones may be related to subjective-genital agreement among women. There is some evidence that estrogens are related to sexual response (Heiman, Rowland, Hatch, & Gladue, 1991) and that androgens can influence female subjective and genital sexual arousal (Tuiten, van Honk, Bernaards, Thijssen, & Verbaten, 2000, reported positive effects; however, Apperloo et al., 2006, reported null effects) and genital responsiveness (Slob et al., 1991). Because menstrual-cycle effects have been demonstrated for other processes related to sexual interest (e.g., Gangestad & Cousins, 2001; Gangestad, Garver-Apgar, Simpson, & Cousins, 2007), and women report their greatest interest in sex at mid-cycle (van Goozen, Wiegant, Endert, & Helmond, 1997), it is plausible that concordance may vary across the menstrual cycle because of the effects of hormone fluctuations on both subjective and genital responses (Schreiner-Engel et al., 1981). Few studies, however, have controlled for cycle variability among female participants. An indirect means of assessing the impact of cycle variation on concordance is to compare estimates from women who are not naturally cycling, such as women using oral contraceptives, with women who are naturally cycling.

The Gender Difference is Due to Differences in Learning and Attention

We may find that methodological moderators do not adequately explain the gender difference in agreement between subjective and genital sexual responses. If this is the case, then we need to consider other factors to explain the gender difference in concordance.

A learning approach explains the gender difference in subjective-genital agreement as a result of differential experiences, the sources of which are at least threefold. First, men have an obvious external cue of their genital response by having a penis that they can see when unclothed and feel when it presses against the body or against clothing during erection. Women’s genital responses, however, are hidden from view and produce less prominent somatosensory cues. A related point is that women’s less obvious genital response may hamper women’s familiarity with their genital anatomy (Gartrell & Mosbacher, 1984). Second, a plethora of negative cultural messages regarding female genitalia and menstruation may pair feelings of shame or embarrassment with genital sensations for women (Steiner-Adair, 1990). Third, women masturbate less often than men (Oliver & Hyde, 1993); masturbation may be one of the best activities for learning about one’s genitals and sexual responding, and pairing positive feelings with sexual activities. For example, prior research has shown that women who masturbate more frequently tend to report higher subjective sexual arousal (Laan & Everaerd, 1995a, 1995b) and show greater concordance (Laan et al., 1993).

Following a similar logic, older participants would be expected to produce greater subjective-genital agreement, especially older female participants, because they have more experience with attending to their genital sensations across different sexual experiences. Consistent with this hypothesis, Brotto and Gorzalka (2002) found that age was positively correlated with subjective-genital agreement among older pre-menopausal women. However, the effect of age may not be linear because there may also be a cohort effect on familiarity and comfort with one’s genitals and with masturbation, such that much older women would show lower rather than higher concordance. If this particular hypothesis is correct, then we would expect a curvilinear relationship between age and concordance among women. Indeed, Brotto and Gorzalka reported that concordance estimates from older post-menopausal women (mean age, 56 years) were lower than those of older pre-menopausal women (mean age, 48 years).

If cultural and anatomical differences have reduced women’s awareness of their genitals, directing their attention back to their genitals should improve concordance; similar effects should be observed for men. For women, however, the data do not support this idea. Merrit, Graham, and Janssen (2001) found that correlations between sexual feelings and genital sexual arousal remained low even when women were asked to estimate their genital responses during erotic stimulation. Similarly, Cerny (1978) found that, even when women received feedback about their level of vaginal engorgement, correlations were low and statistically nonsignificant. Conscious efforts of women to monitor their genital responses may thus not enhance concordance. If learning or attention do not explain the gender difference in concordance, then perhaps differences in information processing may contribute.

The Gender Difference is Due to Differences in Conscious and Unconscious Processing of Sexual Stimuli and Regulation of Genital Arousal

Another way in which the gender difference in concordance could manifest is through differential processing of sexual stimuli. This extension of information processing theory to sexual response predicts that gender differences in concordance emerge because of gender differences in the relative contribution of conscious (cognitively appraised or controlled, explicit) and unconscious (automatic, implicit) processes associated with sexual response (Spiering, Everaerd, & Laan, 2004): Because of men’s more prominent genital anatomy, automatic processes play a greater role in their experience of sexual arousal, resulting in greater concordance; for women, the meanings generated by sexual stimuli may have a greater influence on their subjective appraisal. In support of this notion, Laan (1994) showed that women’s positive appraisal of sexual stimuli was positively correlated with subjective sexual arousal. Additionally, women may report greater negative affect when presented with sexual stimuli, may not appraise some sexual stimuli as “sexual,” or may edit their self-report of feeling sexually aroused because of socially desirable responding (Laan & Janssen, 2007; Morokoff, 1985).

Cognitive models of sexual response and dysfunction propose that positive affect directs attention to erotic stimuli, thereby increasing sexual response, whereas negative affect interferes with the processing of sexual cues, resulting in lower sexual response (see Barlow, 1986). Lower concordance among women may reflect their experience of negative affect while watching the conventional, commercially available erotica that is primarily produced for men and typically used in psychophysiological studies. If information processing theory is correct, then using stimuli that produce less negative affect and more positive sexual appraisals may influence women’s subjective-genital agreement. For example, women reported greater sexual arousal to films that were female-centered, instead of typical commercially available sexual films, but did not show greater genital response to female-centered films (Laan, Everaerd, van Bellen, & Hanewald, 1994). Moreover, Laan et al. (1994) demonstrated that women reported greater negative affect to typical commercial erotica and greater positive affect to female-produced erotica.

Female-centered sexual films are characterized by depictions of women as sexual initiators, a focus on a woman’s pleasure, and sexual interactions that are often presented in the context of an intimate relationship between the actors. In contrast, typical commercially-available sexual films tend to focus instead on men as the sexual initiators, the man’s pleasure, and anonymous or casual sexual interactions. Commercial erotica would therefore be expected to decrease subjective sexual arousal (but not genital response) for women, and so we predict that presenting female-centered erotica would increase subjective-genital agreement among women.

More sexually explicit stimuli elicit greater negative affect among women than among men (Laan & Everaerd, 1995a, 1995b) and women typically have less exposure to sexually explicit materials than men (Hald, 2006). According to information processing theory, sexually explicit stimuli may impede the subjective experience of sexual arousal in women because these films elicit negative affect, and negative affect competes or interferes with the positive affect that usually underlies sexual arousal. Alternatively, more sexually explicit stimuli may, in fact, increase concordance among women, compared with less explicit or erotic stimuli because sexually explicit stimuli evoke greater increases in genital vasocongestion (Heiman, 1980) and, based on signal detection theory, one would expect that detection of a physiological event is dependent upon greater change in that physiological process (Laan, Everaerd, van der Velde, & Geer, 1995).

Asking participants to sexually fantasize might be expected to decrease negative affect, because most participants would be expected to imagine sexual content that they consider to be sexual and enjoyable and, therefore, would experience less negative and more positive affect. This is expected to be true even if some participants have difficulty in fantasizing in the laboratory. Using fantasy should increase women’s subjective-genital agreement to a greater extent than for men if this supposition is correct. In contrast, exposure to conventional sexually explicit materials targeted at a male audience where relatively little attention is paid to contextual factors would be expected to decrease subjective sexual arousal and thereby decrease subjective-genital agreement among women.

If negative affect interferes with positive appraisal of sexual stimuli among women, then experimental instructions would also be expected to have an impact on subjective-genital agreement. Conditions involving instructions to focus on one’s genital sensations should reduce attention to the negative aspects of the sexual stimulus, increase attention to physiological sensations, and thereby increase subjective-genital agreement in women and reduce the observed gender difference. Alternatively, attending to one’s genital sensations may increase the participant’s self-consciousness and interfere with concordance.

Finally, simultaneous assessment of subjective sexual arousal should result in higher subjective-genital agreement compared to post-trial assessment, because it leaves much less time for conscious processing of sexual stimuli, such as cognitive interference from negative affect, than an assessment made in the moments after the trial has ended. This hypothesis does not require conscious processing of the sexual content, as interference could involve unconscious (automatic) processes. Because cognitive interference due to negative affect is expected to be greater for women, we predicted that the impact of simultaneous assessment on concordance should be larger for women than for men.

An off-shoot of information processing theory is that sexually dysfunctional participants, whether male or female, are expected to produce lower subjective-genital agreement than sexually functional participants. Models of sexual dysfunction propose that sexually dysfunctional individuals differ by having more negative cognitions and more negative affect in response to sexual stimuli. Lower concordance among sexually dysfunctional persons may reflect an absence of sexual feelings while experiencing genital responses, as has been demonstrated among women with sexual arousal disorder (Laan et al., 2008), or feeling sexually aroused but not experiencing the expected changes in genital vasocongestion, as in men who have erectile difficulties (Barlow, 1986). The potential for concordance to vary with sexual functioning has been demonstrated in studies of sexually dysfunctional women; women with sexual arousal problems report lower subjective sexual arousal to sexual stimuli in the laboratory, but do not show significantly lower genital responses when compared to women without sexual arousal problems (e.g., Laan et al., 2008; Morokoff & Heiman, 1980).

Summary of Study Hypotheses

We propose the following hypotheses regarding potential moderators, distinguishing between those that test methodological explanations for variation in subjective-genital agreement and those that test theoretically-derived explanations for low concordance in women. We first hypothesize that a reliable, overall gender difference in concordance will be observed, followed by 10 hypotheses based on methodological considerations and 5 predictions based on theoretical considerations.
  1. There will be an overall gender difference in subjective-genital agreement, with men producing higher concordance estimates than women;
  2. Men will show greater concordance for visual sexual stimuli and women will show greater concordance for nonvisual sexual stimuli;
  3. Including a variety of different stimulus categories or modalities of sexual stimuli will yield higher subjective-genital correlations in both genders;
  4. Higher subjective-genital agreement is expected from studies with more stimulus trials in both genders;
  5. Longer stimuli will produce greater subjective-genital correlations in both genders;
  6. Higher subjective-genital agreement is expected in women when subjective responses are recorded contiguously with the recording of genital responses, compared to post-trial assessments;
  7. Higher subjective-genital agreement is expected for subjective response that is defined as perception of genital changes versus feeling sexually aroused in both genders;
  8. Thermography will yield higher estimates of concordance than vaginal photoplethysmography in women;
  9. VBV measurement of genital vasocongestion will show greater concordance than VPA in women;
  10. Concordance calculated using within-subjects correlations will be higher than those calculated between-subjects in both genders;
  11. Concordance will improve with the number of data points used to calculate the correlation in both genders;
  12. Subjective-genital agreement will be more strongly and positively correlated with age among women than among men;
  13. Samples of women receiving sex hormones through oral contraceptives will produce lower correlations than samples of women who are naturally cycling;
  14. Women are expected to show greater subjective-genital agreement when presented with female-centered stimuli, while men will show no difference or might even show less subjective-genital agreement, in comparison to typical commercial sexual content;
  15. Women will show higher concordance for erotic versus sexually explicit films. Men will show the opposite pattern;
  16. Non-clinical samples of sexually functional participants will produce higher estimates of subjective-genital agreement than clinical samples of sexually dysfunctional participants in both genders.

Method

Studies were identified by searching major computerized reference databases (PsycInfo, Medline, PubMed) and by examining the reference lists of relevant studies. The following were the search terms employed, with asterisks indicating variations (e.g., plethy* would include both plethysmograph and plethysmography): vaginal and sexual arousal; plethy* and (subjective or self-report*); plethy* and sexual arousal; photoplethy*; penile and (subjective or self-report); penile and sexual arousal; phallom* and (subjective or self-report); phallo* and sexual arousal; subjective and (physiolog* or psychophysiolog*); and (subjective and genital and arousal).

We included studies published in English and available in peer-reviewed journals, books or book chapters, theses, and dissertations. We did not include unpublished studies (e.g., unpublished manuscripts, conference presentations) because their data could not be easily obtained, verified, or examined by readers. The possibility of a publication bias was examined using a funnel graph. We note that concordance was rarely the main focus of the selected studies, and was instead reported as part of the statistical analyses that were conducted.

Some data sets were reported in more than one publication. In these cases, we coded the publication representing the largest amount of data (e.g., the publication reporting on the largest sample size). Data collection ended in December 2007.

Selection Criteria for Inclusion in the Meta-Analysis

Studies were included in this meta-analysis if they reported data from which the correlation between a self-reported and genital measure of sexual arousal in response to a specified sexual stimulus could be obtained and if they met several other criteria, as explained below.

Criterion 1: Self-Report Measure of Sexual Arousal

Studies had to employ a clearly specified measure of self-reported sexual arousal or subjective estimate of genital response. These included Likert-type ratings, ratings made with visual analog scales, estimates of percentage of full response, or moving levers or other devices to indicate sexual arousal. Subjective sexual arousal and estimated genital response were coded separately.

Criterion 2: Physiological Measure of Genital Arousal

A specific measure of genital sexual arousal had to be employed. For women, genital measures of sexual arousal included vaginal photoplethysmography or thermography. For men, genital measures of sexual arousal included circumferential assessments using mercury-in-rubber, indium-gallium, or mechanical (Barlow) strain gauges, volumetric devices, or thermography.

Criterion 3: A Well-Specified Sexual Stimulus

Self-reported and genital arousal had to be measured in response to some form of psychological sexual stimulation, including self-generated or guided sexual fantasy, exposure to visual sexual stimuli (pictures or film, with or without audio accompaniment), or descriptions of sexual interactions either read by the participant or presented as an audio recording.

Criterion 4: Correlation Coefficient Between Self-Reported and Genital Measures of Sexual Arousal

A correlation coefficient between self-reported and genital measures of sexual arousal had to either be reported in the published article, book chapter, book, thesis, or dissertation, or available from the primary authors. Correlations that were reported as statistically nonsignificant without reports of actual coefficient values were coded as zero.

For estimates of concordance between self-reported feelings of sexual arousal and genital response (hereinafter, Rsub), correlations for one male sample and sixteen female samples were described by the original authors as not statistically significant and therefore coded as zero. For estimates of concordance between self-reported genital response and actual genital response (hereinafter, Rgen), correlations for one male sample and two female samples were described as not statistically significant and therefore coded as zero. In one case, correlations were estimated from study figures (Schaefer, Tregerthan, & Colgan, 1976).

Studies Included in the Meta-Analysis

We identified 132 studies published between 1969 and 2007 reporting genital and self-reported sexual arousal responses, with total sample sizes of 2,505 women and 1,918 men. Of these studies, 70 reported on female samples, 49 reported on male samples, and 13 reported on both male and female samples.

Moderator Variables

As discussed earlier, we focused first on methodological variables that might be moderators of the agreement between self-reported and genital measures of sexual arousal on the basis of previous research. We then examined variables that might be potential moderators under the logic of hypotheses derived from sexual selection, information processing, and learning theories.

We were constrained in our variable selection by the study descriptions that were available in the published reports. For example, most of the studies that were included in this meta-analysis either did not record or report details about the participants’ sexual histories, or reported them in idiosyncratic ways that made meta-analysis impossible, so we could not code for sexual experience as a moderator of concordance, even though this would provide a clear and more direct test of our hypothesis about the impact of learning on concordance. Instead, we used participant age as a proxy for sexual experience, because older participants would have more sexual experiences, on average, than younger participants.

The study variables are described below, organized according to how the data were coded; this order does not necessarily correspond to the order of the study hypotheses. The results, however, are presented in the order in which the hypotheses are listed.

Sample Characteristics

Participant Age

Average participant age for the sample was recorded. If only an age range was provided, the mid-point of this range was selected to represent the sample’s average age.

Study Population

The population from which the study sample was recruited was coded as basic (sexually functional volunteers and pre-menopausal, if female), sexually dysfunctional persons, post-menopausal women, medical patients, sexual offenders, or clinical/sexological patients.

Hormonal Status

This moderator was coded only for female samples. Samples were coded as using oral contraceptives, no use of exogenous hormones (this was only coded when it was clearly stated that participants were not using oral contraceptives or receiving hormone replacement therapy), or unspecified.

Stimulus Characteristics

Stimulus Modality

Stimuli were assigned to the following categories: video/film presented with audio; video/film presented without audio; audiotaped description of sexual interaction; sexual fantasy; sexual text read by subject; still pictures; and still pictures presented with audiotaped descriptions of sexual interaction. Combinations of stimulus modalities were also recorded when correlations were reported in this manner (e.g., across video, fantasy, and still picture stimuli; across both video and audiotape stimuli; across both audiotape and fantasy stimuli).

Stimulus Duration

The total duration of stimulus presentation was recorded in seconds. This was equal to the duration of the stimulus if the study used a single presentation. If multiple sexual stimuli were used, and the correlation between self-reported and genital measures was reported across these stimuli, the total duration of all sexual stimuli was recorded in seconds.

Stimulus Sexual Explicitness

Stimulus content was coded for the explicitness of the sexual interactions presented or described. Stimuli were considered to be explicit if they included clear depictions of genital interactions during oral, anal, or vaginal intercourse. Stimuli that showed sexual interactions without depictions of oral, anal, or vaginal intercourse (e.g., a stimulus depicting kissing, touching, and other foreplay), or that included oral, anal, or vaginal intercourse scenes without clear depictions of the genitals, were coded as erotic.

Female-Centered Stimulus

Stimulus content was coded as female-centered, not female-centered, or missing (e.g., a fantasy stimulus). A stimulus was considered female-centered if it was explicitly described as female-centered or made for female audiences.

Number of Trials

The total number of stimulus trials was coded.

Stimulus Variability

The correlations reported by each study sample were coded as demonstrating stimulus variability if the correlation coefficient was calculated from a set of sexual stimuli that had at least two kinds of stimulus content (e.g., preferred and nonpreferred gender, preferred and nonpreferred activity) or at least two kinds of stimulus modality (e.g., audiovisual and sexual fantasy, still pictures and text).

Self-Reported Sexual Arousal

Two different measures of self-reported sexual arousal were coded. The first was self-reported subjective experience of sexual arousal (e.g., feeling “sexually excited,” “sexually aroused,” or “horny”). We refer to this as Rsub throughout this article. The second was a self-reported estimate or perception of genital response (e.g., a man estimating his erection during a stimulus presentation; a woman rating the intensity of felt genital sensations during a trial). We refer to this as Rgen throughout this article.

Timing

The timing of self-reported arousal assessments was coded as immediately after a stimulus presentation, contiguous with stimulus presentation, or after all sexual stimuli had been presented (end of session).

Genital Sexual Arousal Measurement

Apparatus

The methods of measuring genital sexual arousal differed between the sexes. For women, measures of genital sexual arousal included vaginal photoplethysmography (coded as VPA or VBV, depending on how the data were represented) or thermography (both pelvic and labial temperature changes). For men, measures of genital response included circumferential assessments using mercury-in-rubber, indium-gallium, and mechanical (Barlow) strain gauges, assessment of changes in penile volume, and thermography of the pelvic region.

Statistical Methods

These potential moderators included the type of correlation calculated and the number of data points used to calculate the correlation coefficient.

Type of Correlation

The type of correlation coefficient calculated in each study was coded. Within-subjects correlations address agreement across individual variation in responding, while between-subjects correlations address the agreement across group variation in responding. Mixed correlations are calculated across both participants and stimulus conditions and therefore represent a combination of both within-subjects and between-subjects data points.

Number of Data Points

For within-subjects correlations, the number of data points refers to the number of observations of subjective and genital response for each individual. For between-subjects correlations, the number of data points refers to the number of participants included in the analysis. For mixed correlations, the number of data points refers to the number of participants multiplied by the number of observations per participant.
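The coding rule above can be restated as a small Python function. This is only a sketch of the rule as described; the function name and the string labels for the correlation types are hypothetical.

```python
def n_data_points(kind, n_participants, n_observations):
    """Number of data points behind a correlation, by correlation type.

    kind: "within", "between", or "mixed" (labels assumed for illustration).
    n_observations: observations of subjective and genital response per participant.
    """
    if kind == "within":
        return n_observations
    if kind == "between":
        return n_participants
    if kind == "mixed":
        return n_participants * n_observations
    raise ValueError(f"unknown correlation type: {kind}")
```

For a hypothetical study with 20 participants and 12 stimulus trials each, the rule gives 12 data points for a within-subjects correlation, 20 for a between-subjects correlation, and 240 for a mixed correlation.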

Inter-Rater Reliability

The study coding was completed by the first and second authors. Twelve studies were randomly selected from the final set of studies and coded by both the first and second authors. Inter-rater coding was limited to study conditions representing basic participants, responses to preferred sexual stimuli, no experimental manipulations, and correlations calculated using average self-reported and genital sexual arousal responses (if more than one method of reducing data was reported).

Inter-rater reliability values ranged from good to excellent. Kappas for categorical variables ranged from .81 to 1.00, and Spearman’s rho for ordinal or interval variables ranged from .78 to 1.00. Kappa could not be calculated in some cases because the cross-tabulations of the two ratings were not symmetric. Inspection of these asymmetric categorical variables indicated percentages of agreement from 77% to 100%.
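For readers unfamiliar with these indices, the following Python sketch shows how Cohen's kappa and simple percentage agreement are computed for one categorical variable coded by two raters. The ratings are invented for illustration (12 studies, matching the number that were double-coded) and do not reproduce the actual coding data.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Proportion of items on which the two raters gave the same code."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)
    # Chance agreement from the product of each rater's marginal proportions.
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical stimulus-modality codes for 12 double-coded studies.
rater_a = ["film"] * 6 + ["audio"] * 4 + ["fantasy"] * 2
rater_b = ["film"] * 5 + ["audio"] * 5 + ["fantasy"] * 2
```

With these invented ratings the raters disagree on one study out of twelve, so percentage agreement is 11/12 and kappa works out to 13/15, or about .87.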

The entire data set was checked for errors by the fifth author, who was, at that time, masked to the study hypotheses. A total of 205 discrepancies were found, representing an average of 1.6 discrepancies per study (ranging from 0 to 24 discrepancies), and 0.7% of all possible cells. Discrepancies were resolved in discussions with the first author, and consultation with the second author if necessary.

Effect Size and Analytical Strategy

We used Pearson r as our index of effect size, representing the correlation between subjective and genital sexual arousal (or the correlation between perception of and actual genital arousal, in the case of Rgen). We first examined the overall gender difference by comparing the r obtained for all correlations reported for men and for women. We then examined the gender difference in r for each independent sample. For this analysis, we averaged across all correlations obtained for each sample, using Fisher’s r to z transformation, and transforming back to r. We next examined the correlation between subjective and genital sexual arousal for a selected subset of independent samples (defined below). Finally, we examined subjective-genital agreement in the 13 studies that directly compared male and female samples. We conducted this nested series of analyses to determine if a gender difference in concordance could be reliably detected regardless of how studies were selected.
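The averaging step described above (transform each r to Fisher's z, average, and transform back) can be sketched in Python as follows. This is a minimal, unweighted illustration; the function name and the input values are hypothetical.

```python
from math import atanh, tanh

def average_r(correlations):
    """Average correlations via Fisher's r-to-z transformation.

    Each r is transformed with z = atanh(r), the z values are averaged,
    and the mean z is transformed back with r = tanh(z).
    Inputs must lie strictly between -1 and 1.
    """
    zs = [atanh(r) for r in correlations]
    return tanh(sum(zs) / len(zs))

# Two hypothetical correlations reported for the same sample.
sample_average = average_r([0.74, 0.67])
```

For positive correlations the back-transformed value is slightly higher than the arithmetic mean of the raw coefficients, because the transformation stretches values near 1.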

For analyses involving moderator variables coded in a discrete fashion, an average correlation for each independent sample, at each level of the moderator variable, was calculated. For analyses involving moderator variables coded in a continuous fashion, a correlation was calculated between the putative moderator and the concordance estimate obtained using all relevant independent samples.

Ninety-five percent confidence intervals were calculated to examine gender and moderator differences, and to test if correlations were significantly different from zero. The rule for determining whether a categorical variable difference was statistically significant was that one mean had to be outside of the 95% confidence interval of the other mean; for example, there was a significant difference between male samples and female samples if the mean subjective-genital correlation of either gender did not fall within the 95% confidence interval of the other gender. All analyses were weighted by sample size, so that studies with larger samples had more influence on the average subjective-genital correlation. The Fisher z inverse variance method was used to calculate aggregate correlations, with a random-effect model. Comprehensive Meta-Analysis v1.0.25 (Biostat Inc., Englewood, NJ) and SPSS version 17.0 (SPSS Inc., Chicago, IL) were used for all statistical analyses.
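A minimal Python sketch of this kind of analysis is given below. It pools Fisher-transformed correlations with inverse-variance weights under a random-effects model, and then applies the decision rule for categorical comparisons. Several details are assumptions rather than a description of what Comprehensive Meta-Analysis computes: the sampling variance of z is taken as 1/(n − 3), the between-study variance is estimated with the DerSimonian-Laird method, and all sample values are invented.

```python
from math import atanh, tanh, sqrt

def pool_correlations(samples, z_crit=1.96):
    """Random-effects pooled correlation with a 95% confidence interval.

    samples: list of (r, n) pairs with n > 3.
    Returns (pooled_r, ci_low, ci_high).
    """
    zs = [atanh(r) for r, _ in samples]
    variances = [1.0 / (n - 3) for _, n in samples]  # assumed variance of Fisher z
    w = [1.0 / v for v in variances]                 # fixed-effect weights
    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    # DerSimonian-Laird estimate of between-study variance (assumed estimator).
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(samples) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]     # random-effects weights
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    se = sqrt(1.0 / sum(w_re))
    return tanh(z_re), tanh(z_re - z_crit * se), tanh(z_re + z_crit * se)

def differs(a, b):
    """Decision rule: either mean lies outside the other's confidence interval."""
    mean_a, lo_a, hi_a = a
    mean_b, lo_b, hi_b = b
    return not (lo_b <= mean_a <= hi_b) or not (lo_a <= mean_b <= hi_a)

# Hypothetical (r, n) pairs for three male and three female samples.
men = pool_correlations([(0.70, 20), (0.60, 30), (0.65, 25)])
women = pool_correlations([(0.20, 20), (0.30, 30), (0.25, 25)])
```

Here `differs(men, women)` applies the rule to the two pooled estimates, and a pooled correlation can be treated as different from zero when its interval excludes zero (for example, `women[1] > 0`).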

Results

Table 1 summarizes each study included in the meta-analysis. Details include sample characteristics, measures of subjective and genital sexual arousal, and the average correlation between these measures of sexual arousal.
Table 1

Summary of studies included in meta-analysis

Columns: Study; Sample description (Female, Male); Measures (Genital, Subjective); Study design; Stimuli; Correlation with self-reported sexual arousal (Female, Male); Correlation with perception of genital arousal (Female, Male)

Studies reporting within-subjects correlations

Abel, Blanchard, Murphy, Becker, and Djenderedjian (1981)

21 volunteers (mean age = 24)

Barlow gauge

Likert

Compared two methods of quantifying penile response: percent of full erection and AUC (area under penile response curve).

12 explicit presentations of deviant and nondeviant film and audio.

.74/.67 (% erection/AUC)

.82/.82 (% erection/AUC)

6 outpatients (mean age missing)

24 explicit presentations of homosexual and heterosexual film, audio, and fantasy.

.68/.74

.88/.83

8 sex offenders (mean age missing)

24 explicit presentations of deviant and nondeviant film, audio, and fantasy.

.57/.56

.78/.75

Abrahamson, Barlow, Beck, and Athanasiou (1985)

 

10 volunteers (mean age = 39.5)

Barlow gauge

Lever

Effects of distraction and stimulus intensity on functional and dysfunctional men.

3 explicit film clips.

.61

10 with erectile dysfunction (mean age = 43.6)

Correlations reported across distraction and film conditions.

.32

Bancroft (1971)

 

25 combined heterosexual & homosexual sexology patients (mean age missing)

Strain gauge

Likert

Sexual response to preferred and nonpreferred sexual stimuli

10 slides (5 nude male and 5 nude female).

.74

Barlow, Sakheim, and Beck (1983)

 

12 students (mean age = 26.3)

Barlow gauge

Lever

Effect of anxiety induced by shock threat. Correlation reported for no-shock condition.

1 explicit film.

.68

Beck and Barlow (1986)

 

12 with secondary erectile dysfunction (mean age = 43.8)

Barlow gauge

Lever

Effect of attentional focus and anxiety (induced by shock threat).

4 explicit heterosexual films of foreplay.

.70 (no shock threat)

12 volunteers (mean age = 40.9)

Correlations reported across focus conditions (sensate and spectator focus) and group.

.52 (shock threat)

Beck, Barlow, and Sakheim (1983)

 

8 volunteers (mean age = 35)

Barlow gauge

Lever

Effects of attentional focus (self versus partner).

6 explicit black-and-white heterosexual films of foreplay.

.23

8 with sexual dysfunction (mean age = 42)

Correlations reported across focus conditions.

.13

Beck, Barlow, Sakheim, and Abrahamson (1987)

 

16 volunteers (mean age = 24)

Barlow gauge

Lever

Effects of shock threat, selective attention, thought content and affect.

4 explicit heterosexual audiotaped clips.

.75

Correlations reported across shock conditions.

Chivers et al. (2007)

27 heterosexual students & volunteers (mean age = 22.3)

27 heterosexual students & volunteers (mean age = 24)

VPA/Strain gauge

Lever

Gender and orientation differences in response to sexual activities vs. gender of actors in sexual films.

16 film clips (mating bonobos, nude exercise, masturbation & copulation clips).

.51

.82

20 homosexual students & volunteers (mean age = 28)

17 homosexual students & volunteers (mean age = 25.1)

Correlations reported across several stimuli.

.56

.85

Cranston-Cuebas et al. (1993)

 

10 volunteers (mean age = 43.9)

Barlow gauge

Lever

Effects of misattribution manipulation using placebo.

3 explicit heterosexual films depicting 2 females, 1 male.

.23

10 with secondary erectile dysfunction (mean age = 48.8)

  

Correlations reported across manipulation conditions (detraction, enhancement, neutral).

.45

Dekker and Everaerd (1988)

48 students (mean age = 22)

48 students (mean age = 23)

Barlow gauge

Likert

Attentional effects on sexual arousal.

15 explicit heterosexual slides, one audiotaped narrative, and sexual fantasy.

.43

VPA

Correlations averaged across stimuli and focus conditions (focus on situation/action and focus on sexual response).

.37

VBV

.16

Farkas, Sine, and Evans (1979)

 

30 students (mean age = 26.4)

Strain gauge

Lever

Effects of distraction, performance demand and stimulus explicitness on sexual arousal.

1 explicit or nonexplicit (clothed, heterosexual sensual interaction) black-and-white film without audio.

.49 (attention)

.35 (distraction)

Korff and Geer (1983)

10 students (mean age missing)

 

VPA

Visual-auditory scale

Relationship between focus condition and concordance.

10 erotic heterosexual slides.

.48/.47/.69 (scale/light/tone)

12 students (mean age missing)

Correlations for 3 subjective scales: 5-point rating, light, and sound (tone).

10 erotic heterosexual slides.

.87/.82/.90

Genital focus group.

14 students (mean age missing)

Non-genital focus group.

10 erotic heterosexual slides.

.86/.82/.79

Laan and Everaerd (1995b)

16 students (mean age = 21)

 

VPA

Lever

Habituation of sexual arousal.

11 slides, depicting heterosexual sex and nude or semi-nude male or female models.

.38

19 students (mean age = 20.5)

21 explicit heterosexual films (female-centered).

.28

20 students (mean age = 20.5)

21 uniform presentations of a cunnilingus scene.

.24

Laan, Everaerd, van der Velde et al. (1995)

17 students (mean age = 20)

 

VPA

Lever

Association between subjective and genital arousal.

21 presentations of explicit, heterosexual film clip (female-centered).

.26

14 students (mean age = 20)

21 explicit heterosexual film clips (female-centered). Clips were different but showed the same content.

.30

19 students (mean age = 20)

21 explicit heterosexual film clips (female-centered), increasing in sexual intensity.

.61

Mavissakalian et al. (1975)

 

6 heterosexual students (mean age = 22.6)

Barlow gauge

Lever

Responses to erotic stimuli in homosexual and heterosexual males.

16 explicit black-and-white film clips, depicting heterosexual activity, single female activity, homosexual male or lesbian activity.

.69/.74 (within/between)

6 homosexual sexology patients (mean age = 21.5)

Correlations reported across two sessions as within-subjects and between-subjects.

.70/.57

Meuwissen and Over (1992)

10 students (mean age = 26.9)

 

VPA/VBV

Likert

Correlation between subjective and genital arousal across phases of menstrual cycle.

9 explicit film clips and 15 fantasies. Film varied in content and target stimuli. Fantasy varied in content and included atypical sex.

.62/.69 (menstrual)

.72/.69 (post-menstrual)

Correlations calculated across all 24 stimulus trials for each phase.

.60/.68 (luteal)

.69/.73 (pre-menstrual)

Rellini et al. (2005)

22 volunteers (mean age = 27)

 

VPA

Likert

Relationship between physiological and subjective arousal in women.

1 explicit heterosexual film.

−.08

−.13

Rowland and Heiman (1991)

 

9 volunteers (mean age = 36.1)

Strain gauge

Likert

Arousal before (Time 1) and after (Time 2) sex therapy program.

2 explicit heterosexual audiotapes, narrated by female.

.74/.81 (time 1/time 2)

.80/.72 (time 1/time 2)

9 with sexual dysfunction (mean age = 41.8)

Correlations reported across sensate focus and instructional demand.

Unstructured fantasy.

.61/.73

.67/.52

Rubinsky et al. (1985)

6 volunteers (mean age = 28)

10 volunteers (mean age = 28)

VPA or strain gauge

Likert

Testing validity of groin skin temperature.

1 explicit black-and-white heterosexual film.

.09

.43

VBV

.53

Thermography

.63

.31

Sakheim, Barlow, Abrahamson, and Beck (1987)

 

10 healthy volunteers (mean age = 38.1)

Barlow gauge

Lever

Distinguishing between psychogenic and organogenic erectile dysfunction.

1 explicit heterosexual film clip.

.72

10 with psychogenic sexual dysfunction (mean age = 44.6)

.50

10 with organogenic sexual dysfunction (mean age = 55.8)

.26

Strassberg, Kelly, Carroll, and Kircher (1987)

 

13 volunteers (mean age = 30)

Barlow gauge

Likert

Sexual arousal and premature ejaculation.

3 explicit film clips.

.54

13 with premature ejaculation (mean age = 33)

.49

Webster and Hammer (1983)

 

8 heterosexual volunteers (mean age = 27)

Barlow gauge

Likert

Thermographic measurement of arousal.

3 explicit heterosexual film clips (black and white).

.95

Thermography

.94

Wincze, Hoon, and Hoon (1977)

6 volunteers (mean age = 24.3)

VPA

Likert

Comparing cognitive and physiological responses.

17 explicit film clips, depicting heterosexual intercourse, group sex and single homosexual scene.

.41

Thermography

.27

Wincze et al. (1980)

8 volunteers (mean age = 22.2)

6 volunteers (mean age = 20.6)

VPA or Barlow gauge

Likert

Effects of subjective monitoring.

2 heterosexual film clips, depicting low arousal (kissing) and high arousal (intercourse).

.15

.69

Correlations reported across arousal conditions.

Wincze and Qualls (1984)

8 homosexual volunteers (mean age = 26)

8 homosexual volunteers (mean age = 26)

VPA or Barlow gauge

Likert

Sexual orientation and sexual arousal to preferred and nonpreferred sexual stimuli.

5 explicit films depicting female–male sex, male–male sex, female–female sex, group sex and neutral.

.69

.86

Wormith (1986)

36 combined sex offenders and non-sex offenders (mean age = 30.2)

Strain gauge

Likert

Physiological and cognitive aspects of deviant sexual arousal. Correlations reported across content conditions.

12 explicit slides, depicting adult male, adult female, child male, child female, heterosexual couple and neutral scene.

.53

Studies reporting between-subjects correlations

Abramson et al. (1981)

37 students (mean age = 28)

32 students (mean age = 28)

Thermography

Likert

Discriminant validity of thermography.

1 explicit story.

.70

.73

Adams et al. (1985)

24 students (mean age = 20.1)

 

VPA

Likert

Effect of cognitive distraction.

1 explicit heterosexual audiotape.

.37 (no distraction)

.74 (distraction)

Adams, Wright, and Lohr (1996)

 

64 students (mean age = 20.3)

Strain gauge

Likert

Homophobia and sexual arousal.

1 explicit heterosexual film.

.57

.64

1 explicit female–female film.

.63

.66

1 explicit male–male film.

.53

.64

Bach et al. (1999)

 

26 volunteers (mean age = 32.2)

Barlow gauge

Likert

False negative feedback and sexual arousal.

2 explicit heterosexual films with no audio.

.29 (film 1)

.28 (film 1)

−.16 (film 2)

.37 (film 2)

Basson and Brotto (2003)

34 post-menopausal with sexual dysfunction (mean age = 56.6)

 

VPA

Likert

Drug trial for sildenafil citrate.

1 explicit heterosexual film.

.19

.30

Bellerose and Binik (1993)

58 combined volunteers, hysterectomy & oophorectomy patients (mean age = 46)

 

VPA

Likert

Body image and sexuality in oophorectomized women.

1 explicit heterosexual film.

.23

.28

Lever

Correlations reported across two sessions.

.24

.28

Bernat, Calhoun, and Adams (1999)

 

34 students (mean age = 19.9)

Strain gauge

Likert

Arousal to consensual and nonconsensual sex.

2 explicit audiotaped clips paired with nude female slide.

.64

Both et al. (2004)

10 students (mean age = 22.6): Study 1

10 students (mean age = 22)

VPA or Barlow gauge

Likert

Sexual behavior (study 1) and responsiveness to stimuli (study 2) following laboratory-induced sexual stimulation.

1 explicit heterosexual film.

−.26

.45

.13

.60

24 students & volunteers (mean age = 24.5): Study 2

24 students & volunteers (mean age = 27)

−.02

.03

−.06

−.06

Both, Everaerd, Laan, and Gooren (2005)

28 students (mean age = 22)

19 students (mean age = 22)

VPA or Barlow gauge

Likert

Effects of dopamine on arousal in men vs. women.

2-min fantasy period (unstructured).

.13

.67

.16

.65

1 explicit heterosexual film.

.31

.49

.16

.68

Both, Van Boxtel, Stekelenburg, Everaerd, and Laan (2005)

26 students (mean age = 22.9)

 

VPA

Likert

Spinal reflexes and arousal to films of increasing intensity.

1 low-intensity heterosexual film (kissing).

−.23

.12

1 medium-intensity heterosexual film (kissing & caressing).

.30

.37

1 high-intensity heterosexual film (intercourse).

.20

.04

Bradford and Meston (2006)

38 volunteers (mean age = 25.4)

 

VPA

Likert

Effect of anxiety.

1 explicit heterosexual film.

.35

.32

Brauer et al. (2006)

24 volunteers (mean age = 26.6)

 

VPA

Likert

Arousal to coital vs. non-coital sex in women with dyspareunia.

1 explicit heterosexual film depicting oral sex.

.43/.25/.31 (oral/coitus/average)

.43/.37/.43 (oral/coitus/average)

48 with dyspareunia (mean age = 28.2)

Correlations reported for oral film, coitus film, or average across both films.

1 explicit heterosexual film depicting coitus.

−.14/.21/.03

.00/.27/.15

Brauer et al. (2007)

48 volunteers (mean age = 23.9)

 

VPA

Likert

Pain-related fear and dyspareunia.

1 explicit heterosexual film.

.18

.24

48 with dyspareunia (mean age = 25.9)

−.05

.12

Briddell et al. (1978)

 

48 heterosexual students (mean age = 22)

Strain gauge

Likert

Effects of alcohol and cognitive set.

1 explicit heterosexual audiotape (consensual).

.57

.73

1 explicit audiotape of forcible rape scenario.

.42

.56

Fantasy.

.55

.60

Briddell and Wilson (1976)

 

48 students (mean age = 20)

Strain gauge

Likert

Effects of alcohol and alcohol expectancy.

2 explicit heterosexual films.

.66

Correlations reported across four alcohol and two expectancy conditions.

Brotto and Gorzalka (2002)

25 pre-menopausal (mean age = 24.5)

 

VPA

Likert

Effects of hyperventilation on sexual arousal in pre- and postmenopausal women.

1 explicit heterosexual film.

.14

.19

21 pre-menopausal (mean age = 47.8)

Correlations reported across two sessions.

.42

.48

25 post-menopausal (mean age = 56)

.26

.30

Brotto et al. (2004)

30 volunteers (mean age = 23.4)

 

VPA

Likert

Patterns of sexual response in sexually dysfunctional women.

1 explicit heterosexual film.

−.38

−.22

31 with sexual dysfunction (mean age = 30.6)

.17

.08

Cerny (1978)

10 students given no feedback (mean age = 19.9)

 

VBV/VPA

Likert

Biofeedback and voluntary control.

1 explicit heterosexual film.

.72/.00 (nonsig)

10 students given accurate feedback (mean age = 19.9)

Correlations reported across 10 trials.

.00/.00 (nonsig)

10 volunteers given false feedback (mean age = 19.9)

.00/.00 (nonsig)

Chivers (2003)

69 heterosexual volunteers (mean age = 24.6)

39 heterosexual volunteers (mean age = 29.6)

VPA or strain gauge

Likert

Relationship between sexual arousal to preferred and nonpreferred sexual stimuli and sexual orientation.

3 explicit film clips depicting gay, lesbian, or heterosexual sex.

.36/.52/.41 (gay/lesbian/heterosexual)

.58/.48/.51

19 homosexual volunteers (mean age = 28.4)

29 homosexual volunteers (mean age = 32.7)

.57/.59/.49

.55/.67/.41

17 bisexual volunteers (mean age = 25.1)

30 bisexual volunteers (mean age = 29.6)

.39/.55/.56

.08/.48/.19

Danjou, Alexandre, Warot, Lacomblez, and Puech (1988)

 

10 volunteers (mean age = 23)

Strain gauge

Visual analog scale

Effects of apomorphine and yohimbine.

50 explicit slides.

.01

Elliott and O’Donohue (1997)

24 volunteers (mean age missing)

 

VPA

Likert

Effects of anxiety and distraction.

1 erotic heterosexual audiotape.

.02 (control)

2 erotic heterosexual audiotapes.

.16 (mean of high and low distraction)

Exton et al. (1999)

10 students (mean age = 24.8)

 

VPA

Likert

Effect of masturbation.

1 explicit heterosexual film.

.45

Geer, Morokoff, and Greenwood (1974)

14 students (mean age missing)

 

VPA/VBV

Likert

Development of device to measure vaginal blood volume.

1 explicit heterosexual film.

.00/.00 (nonsig)

George et al. (2006)

 

65 students & volunteers (mean age = 25.6)

Strain gauge

Likert

Effects of alcohol.

2 explicit heterosexual film clips.

.50

60 students & volunteers (mean age = 25)

.45

Gerard (1982)

10 mastectomy patients (mean age = 47)

 

VPA

Likert

Mastectomy and sexual functioning.

1 erotic heterosexual film.

.15

10 volunteers (mean age = 48)

 

.74

Graham, Janssen, and Sanders (2000)

27 students (90% heterosexual) (mean age = 26.9)

 

VPA

Visual/auditory scale

Effects of fragrance on arousal, mood and menstrual cycle.

3 explicit heterosexual films (2 female-centered).

−.17

−.18

Correlations reported across conditions.

3 min of unstructured fantasy.

−.16

−.09

Heiman (1977)

59 students (mean age = 19)

39 students (mean age = 19)

VPA or strain gauge

Likert

Male and female sexual arousal.

8 erotic heterosexual audiotapes and 4 unstructured sexual fantasies.

.56

.54

Heiman (1980)

55 married or unmarried volunteers (mean age = 30)

 

VPA

Likert

Physiological, affective and contextual correlates of sexual response.

1 explicit heterosexual film with no audio.

.32

1 explicit heterosexual audiotape, narrated by male.

.39

Heiman and Hatch (1980)

 

16 heterosexual volunteers (mean age = 35.4)

Strain gauge

Likert

Affective and physiological correlates of male sexual response.

1 explicit heterosexual audiotape, narrated by female.

.78

.82

Unstructured fantasy.

.66

.79

Heiman and Rowland (1983)

 

16 volunteers (mean age = 34)

Strain gauge

Likert

Effects of demand instructions on functional and dysfunctional men.

1 explicit heterosexual audiotape.

.67

.00 (nonsig)

1 explicit heterosexual audiotape.

.65

.00 (nonsig)

Fantasy.

.67

.00 (nonsig)

14 with sexual dysfunction (mean age = 39)

   

1 explicit heterosexual audiotape.

.55

.00 (nonsig)

1 explicit heterosexual audiotape.

.55

.60

Fantasy.

.83

.00 (nonsig)

Heiman et al. (2001)

12 pre- and post-menopausal (mean age missing)

 

VPA

Likert

Comparison between VPA, VBV and pelvic imaging (regional blood volume, RBV, and clitoral blood volume, CBV).

1 explicit film.

.65

.73

VBV

.47

.54

RBV

.51

.42

CBV

.50

.45

Hoon (1980)

23 volunteers (mean age = 26)

 

VPA/VBV

Likert

Effects of biofeedback.

3-min fantasy period.

.06/.26 (no feedback)

.14/.25 (feedback)

Islam et al. (2001)

6 with sexual arousal disorder (mean age = 40.4)

 

VPA

Likert

Drug trial for topical alprostadil USP. Correlations reported for placebo condition.

1 explicit film.

−.34

.43

Janssen, Vorst, Finn, and Bancroft (2002)

 

39 students (mean age = 23)

Rigiscan

Likert

Evaluate predictive value of self-report scales across low- (LD) and high-demand (HD) conditions, with or without distraction.

1 explicit heterosexual film (female-centered).

.55 (LD)

.66 (LD)

.62 (LD + distraction)

.82 (LD + distraction)

.27 (HD)

.47 (HD)

1 erotic film clip depicting coercive heterosexual interactions.

.62 (HD + distraction)

.70 (HD + distraction)

.54 (LD)

.60 (LD)

.63 (HD)

.77 (HD)

Julien and Over (1988)

 

24 students (mean age = 26)

Strain gauge

Likert

Habituation study across five modalities: film, audiotape, fantasy, pictures, and text.

8 explicit heterosexual stimuli per modality, depicting activities progressing from couple undressing to male ejaculation.

.27 (film)

.52 (audiotape)

.76 (fantasy)

.62 (stills)

.54 (text)

Kukkonen et al. (2007)

10 students (mean age = 20.8)

10 students (mean age = 21.4)

Thermography

Likert

Validity of thermography among men and women. Both within- and between-subjects correlations reported.

1 explicit heterosexual film.

.40/.60 (between-subjects/within-subjects)

.71 (between)

Laan et al. (1993)

46 students (mean age = 20.7)

 

VPA

Lever and Likert

Performance demand vs. no demand and sexual arousal. Subjective sexual arousal assessed during stimulus (DS) with a lever and post-stimulus (PS) with a Likert scale.

1 explicit heterosexual film.

.26/.39 (DS-demand/PS-demand)

.28/.46 (DS-none/PS-none)

1 2-min unstructured fantasy period.

.00/.00 (DS-demand/PS-demand, nonsig)

.00/.00 (DS-none/PS-none, nonsig)

Laan et al. (1994)

47 students & volunteers (mean age = 25)

 

VPA

Likert

Effects of male- vs. female-centered stimuli.

1 explicit heterosexual film (female-centered).

−.05

.09

Correlations averaged over film order.

1 explicit heterosexual film (male-centered).

.16

.32

Laan, Everaerd, and Evers (1995)

49 students (mean age = 22.3)

 

VPA

Visual analog scale

Response specificity and construct validity of VPA and VBV.

1 explicit heterosexual film (female-centered).

.33

.00 (nonsig)

1 explicit heterosexual film, depicting sexual threat.

.47

.31

VBV

1 explicit heterosexual film (female-centered).

.38

.11

1 explicit heterosexual film, depicting sexual threat.

.45

.00 (nonsig)

Laan, Everaerd, van Berlo, and Rijs (1995)

13 volunteers (mean age = 22.5)

 

VPA

Likert

Effects of mood.

1 explicit heterosexual film clip, and 5 min of unstructured fantasy.

.28/−.06 (film/fantasy)

Laan et al. (2001)

38 post-menopausal (mean age = 54)

 

VPA

Likert

Effects of tibolone.

1 erotic heterosexual film clip, depicting foreplay (female-centered).

.37

.41

1 explicit heterosexual film clip (female-centered).

.56

.35

1 3-min unstructured fantasy period (fantasy 1).

.52

.33

1 3-min unstructured fantasy period (fantasy 2).

.42

.51

Laan et al. (2002)

12 volunteers (mean age = 23.5)

 

VPA

Likert

Sildenafil drug trial. Correlations reported for placebo condition.

1 explicit heterosexual film.

.32

.26

Lake Polan et al. (2003)

20 volunteers (mean age = 24.9)

 

VPA

Likert

Sexual arousal in women to female-centered films.

1 explicit heterosexual film with sound.

.16

1 explicit heterosexual film without sound.

.16

Lange, Wincze, Zwick, Feldman, and Hughes (1981)

 

24 students (mean age = 22.6)

Barlow gauge

Lever

Effects of performance demand and epinephrine. Correlation for no drug/demand condition.

1 explicit heterosexual film clip (black and white, no audio).

.68

Lohr, Adams, and Davis (1997)

 

24 students (mean age = 19)

Strain gauge

Likert

Sexual arousal in sexually coercive vs. non-coercive men.

7 explicit heterosexual audiotape clips, depicting consent, verbal threat, rape, or nonsexual aggression, with nude female slide.

.76

.87

24 students; mix of controls & sexually coercive (mean age = 19.5)

   

7 explicit heterosexual audiotape clips, depicting consent, verbal threat, rape or nonsexual aggression.

.86

.85

Malamuth and Check (1980)

 

71 students (mean age missing)

Strain gauge

Likert

Sexual arousal to rape depictions.

3 explicit audiotape clips, depicting consensual or non-consensual sex.

.54

69 students (mean age missing)

1 explicit audiotape clip, depicting non-consensual sex.

.31

McCall and Meston (2007)

16 volunteers (mean age = 27.3)

 

VPA

Likert

Effects of false and negative feedback among sexually dysfunctional women.

1 explicit heterosexual film.

.51/.27 (positive feedback/negative feedback)

.39/.21 (positive feedback/negative feedback)

15 with sexual dysfunction (mean age = 35.3)

.28/.03

.44/.05

McConaghy (1969)

 

37 homosexual men (mean age = 26.5)

Volumetric

Method not reported

Responses after aversion therapy. Correlation reported across male and female stimuli.

20 explicit film clips.

.00 (nonsig)

Messé and Geer (1985)

30 students (mean age = 20)

 

VPA

Likert

Effect of Kegel exercise on sexual arousal.

4-min unstructured fantasy periods.

.00 (nonsig)

Correlation reported across all sessions and treatment groups.

Meston and Gorzalka (1995)

35 students (mean age = 24.6)

 

VPA/VBV

Likert

Effect of exercise.

1 explicit heterosexual video clip.

−.19/−.23

Correlation reported for no exercise condition.

Meston, Gorzalka, and Wright (1997)

15 volunteers (mean age = 25.6)

 

VPA/VBV

Likert

Effects of clonidine and sympathetic activation via exercise.

Correlations reported for placebo conditions.

1 explicit heterosexual film clip.

.25/.19 (no exercise)

.27/.56 (exercise)

Meston and Heiman (1998)

20 students (mean age = 25.8)

 

VPA

Likert

Effects of ephedrine sulfate on sexual arousal.

1 explicit heterosexual film clip.

.23

.14

Correlations reported for the placebo condition.

 

 

Meston and Worcel (2002)

24 post-menopausal with sexual arousal disorder (mean age = 53.7)

 

VPA

Likert

Drug trial for yohimbine plus l-arginine glutamate.

1 explicit heterosexual film (female-centered).

−.01

−.03

Meston (2004)

15 hysterectomy patients with fibroids (mean age = 41.4)

 

VPA

Likert

Effects of hysterectomy and exercise.

1 explicit heterosexual film.

.06/.04 (no exercise/exercise)

.17/.39 (no exercise/exercise)

−.37/.12

−.31/−.07

17 with fibroids only (mean age = 40)

Meston (2006)

16 volunteers (mean age = 28.9)

 

VPA

Likert

State and trait self-focused attention in sexually functional vs. dysfunctional women.

1 explicit heterosexual film.

.31/.31 (no focus/self-focus)

16 with sexual dysfunction (mean age = 32.3)

.11/−.25

Meston and McCall (2005)

13 students (mean age = 26.6)

 

VPA

Likert

Dopamine and norepinephrine responses to erotic stimuli in sexually functional vs. dysfunctional women.

1 explicit heterosexual film.

.14

.15

9 with sexual dysfunction (mean age = 31.4)

.19

.08

Meuwissen and Over (1990)

7 students (mean age = 28.8)

 

VBV

Likert

Habituation of sexual arousal.

2 explicit heterosexual film clips with no audio.

.46

Correlation reported for dishabituation (novel) stimulus only.

2 structured fantasies using descriptive slides.

.61

Miller (1999)

 

82 students (mean age = 19.4)

Strain gauge

Lever

Sexual arousal to coercive and non-coercive stimuli.

1 explicit heterosexual film clip.

.19

Mitchell, DiBartolo, Brown, and Barlow (1998)

 

24 volunteers (mean age = 38.5)

Barlow gauge

Lever

Effects of mood. Correlations reported for control conditions, separately for two sessions.

1 explicit film clip.

.93/.99 (session 1/2)

Morokoff (1985)

62 students (mean age = 19)

 

VPA

Likert

Effects of guilt, repression and experience.

1 explicit heterosexual film.

.42

Unstructured fantasy.

.00 (nonsig)

Morokoff and Heiman (1980)

11 healthy volunteers (mean age = 30)

 

VPA

Likert

Comparing sexually functional and dysfunctional women before (session 1) and after (session 2) therapy.

1 explicit heterosexual audiotape, narrated by male.

−.55/−.65 (session 1/2)

1 explicit heterosexual film with no sound.

.00/.00 (nonsig)

3 unstructured fantasies.

.24/.00 (nonsig)

11 with sexual dysfunction (mean age = 29)

Correlation reported across 3 fantasies.

1 explicit heterosexual audiotape, narrated by male.

.00/.00 (nonsig)

1 explicit heterosexual film with no sound.

.00/.00 (nonsig)

3 unstructured fantasies.

.00/.25 (nonsig)

O’Donohue and Geer (1985)

 

40 students (mean age = 20.5)

Barlow & Strain gauge

Likert

Effects of habituation. Correlation reported for first trial.

1 explicit slide (either nude or heterosexual activity).

.45

Osborn and Pollack (1977)

12 students (mean age = 25.2)

 

VPA/VBV

Likert

Effects of two types of erotic literature. Rank order correlations reported.

10 explicit heterosexual stories (“hardcore”).

−.40/.00 (nonsig)

10 heterosexual stories (“erotic realism”).

.00/.00 (nonsig)

Palace and Gorzalka (1990)

16 students (mean age = 28)

 

VBV

Likert

Effects of anxiety in women with vs. without sexual dysfunction.

2 explicit heterosexual film clips.

.00 (nonsig)

16 with sexual dysfunction (mean age = 30)

.00 (nonsig)

Palace and Gorzalka (1992)

16 students (mean age = 26)

 

VBV

Likert

Sexual arousal patterns in women with vs. without sexual dysfunction.

1 erotic heterosexual film clip with no sound, 1 explicit heterosexual film clip with no sound, and 1 erotic heterosexual film clip with sound.

.50/.00/.00 (erotic no sound/explicit no sound/explicit with sound, nonsig)

16 with sexual dysfunction (mean age = 26)

.00/.00/.00 (nonsig)

Payne et al. (2007)

20 volunteers (mean age = 22.2)

 

Thermography

Likert

Effects of arousal on genital and non-genital sensations in women with vulvar vestibulitis syndrome vs. controls.

1 explicit heterosexual film.

.30

20 with vulvar vestibulitis syndrome (mean age = 23.8)

−.05

Peterson and Janssen (2007)

26 students (mean age = 20.3)

19 students (mean age = 20.7)

VPA or Rigiscan

Likert

Role of affect in predicting arousal in men and women.

1 explicit heterosexual film (female-centered).

.06

.46

1 explicit heterosexual film (male-centered).

.20

.52

1 explicit heterosexual film (coercive).

.12

.26

1 explicit heterosexual film (runner-up).

.01

.51

Rogers, Van de Castle, Evans, and Critelli (1985)

10 students (low SAI group) (mean age = 23)

 

VPA

Likert

VPA during erotic conditions and sleep. Groups defined by Sexual Arousal Inventory (SAI) score.

1 explicit heterosexual film.

.36/−.18 (film/fantasy)

10 students (high SAI group) (mean age = 27)

1 10-min period of unstructured fantasy.

.45/.30

Sakheim et al. (1985)

 

8 heterosexual volunteers (mean age = 35)

Barlow gauge

Lever

Sexual orientation and sexual arousal.

2 explicit heterosexual films, gay (male–male) films, and lesbian (female–female) films.

.84/.67/.77 (heterosexual/gay/lesbian)

8 homosexual volunteers (mean age = 35)

  

Correlations reported across 2 sessions.

 

.57/.71/.31

Salemink and van Lankveld (2006)

21 volunteers (mean age = 23.3)

 

VPA

Likert

Effects of distraction among sexually functional vs. dysfunctional women.

1 explicit heterosexual film.

−.02

20 with sexual dysfunction (mean age = 29.4)

Correlations for no distraction condition.

.14

Schacht et al. (2007)

42 volunteers (sexually abused or non-abused; mean age = 24.7)

 

VPA

Likert

Effects of alcohol and instructional set among sexually abused vs. non-abused women.

2 explicit heterosexual film clips.

.01

.03

Correlations across alcohol and demand conditions.

Schaefer et al. (1976)

 

8 healthy volunteers (mean age = 25)

Barlow gauge

% erection estimate

Concordance using % estimate of erection.

Explicit text describing heterosexual intercourse.

.86

Schreiner-Engel et al. (1981)

30 healthy volunteers (mean age = 25)

 

VPA

Likert

Concordance across phases of the menstrual cycle.

1 explicit heterosexual film clip.

.11/.18/.09 (follicular/ovulatory/luteal)

1 5-min unstructured fantasy (fantasy 1).

.07/.05/.34

1 5-min unstructured fantasy (fantasy 2).

.32/.09/.45

Seal et al. (2005)

16 students & volunteers (mean age = 21.8)

 

VPA

Likert

Sexual arousal before and after oral contraception use.

1 explicit heterosexual film.

.50/.83 (before/after)

.11/.57 (before/after)

Slob et al. (1990)

24 with diabetes (mean age = 33.6)

 

Thermography

Likert

Sexual arousal in women with diabetes.

1 explicit group sex film, depicting 2 females and 1 male.

.22

10 healthy volunteers (mean age = 31.2)

.69

Slob et al. (1996)

9 healthy volunteers (mean age = 31)

 

Thermography

Likert

Sexual arousal and the menstrual cycle with and without vibrotactile stimulation.

1 explicit heterosexual film. Follicular phase.

−.26/.05 (with/without vib)

.26/.07

11 healthy volunteers (mean age = 34)

1 explicit heterosexual film. Luteal phase.

.15/.38

.39/.17

Stock (1983)

75 students (mean age missing)

 

VPA

Likert

Effects of sexually violent stimuli.

1 explicit heterosexual audiotape, depicting mutual consent (n = 15).

.17

1 explicit heterosexual audiotape, depicting woman aroused by sexual assault (n = 15).

.33

1 explicit heterosexual audiotape, depicting pain and negative emotional reactions of female rape victim (n = 15).

.29

1 explicit heterosexual audiotape, depicting realistic sexual assault (n = 75).

.00 (nonsig)

ter Kuile, Vigeveno, and Laan (2007)

29 volunteers (mean age = 23.4)

 

VPA

Likert

Stress and arousal in women. Correlation reported for no-stress group.

1 explicit heterosexual film.

.56

Tollison et al. (1979)

 

10 heterosexual students (mean ages missing for all samples)

Barlow gauge

Likert

Sexual arousal in heterosexual, bisexual & homosexual men.

1 explicit gay film clip.

−.15/.65/.45/.63 (gay film/het. film/nude males/nude females)

.48/.88/.64/.63

10 bisexual volunteers

1 explicit heterosexual film clip.

.59/.30/.64/.51

.82/−.21/.37/.65

Slides of nude males.

10 homosexual volunteers

Slides of nude females.

.25/.14/.63/−.52

.17/.50/.90/.66

Van Lankveld and van den Hout (2004)

 

26 volunteers (mean age = 47.9)

Barlow gauge

% erection estimate

Distraction and level of stimulation in sexually functional vs. dysfunctional men.

1 explicit heterosexual film (high explicit).

.28/.17 (high explicit/low explicit)

23 with sexual dysfunction (mean age = 55.7)

Correlations for listen-only condition.

1 heterosexual film (low explicit).

.17/−.03

Weisberg, Brown, Wincze, and Barlow (2001)

 

52 students (mean age = 20.8)

Strain gauge

Likert/% Maximum

Causal attributions and sexual arousal.

1 explicit heterosexual film clip.

.46/.46

.78/.78

Wilson and Lawson (1976)

 

40 students (mean age = 20)

Strain gauge

Likert

Effects of alcohol and alcohol expectancy.

1 explicit black-and-white heterosexual film.

.53

Wilson and Lawson (1978)

40 students (mean age = 22)

 

VPA

Likert

Effects of alcohol and alcohol expectancy.

1 explicit black-and-white heterosexual film.

.11

.04

1 explicit black-and-white lesbian film.

.24

.15

Studies reporting both mixed- and between-subjects correlations

Apperloo et al. (2006)

10 volunteers (mean age = 23)

 

VPA

Likert

Effects of testosterone. Correlation calculated across conditions.

1 explicit heterosexual film, 2 neutral film clips and fantasy period.

.51

Cohen, Rosen, and Goldstein (1985)

 

18 mixed sample of men with vs. without sexual dysfunction (mean age = 46.1)

Strain gauge

Likert

EEG study. Correlations reported across 2 sessions.

2 explicit heterosexual films

.71

2 explicit heterosexual audiotapes.

.86

Gaither (2001)

 

20 students (mean age = 20.3)

Strain gauge

Likert

Validity and reliability of new measures of sexual arousal. Correlations reported using 5 mean physiological responses to the 4 sexual activities plus neutral clips.

40 explicit heterosexual film clips, depicting fellatio, cunnilingus, vaginal penetration and anal penetration.

.80

Gaither and Plaud (1997)

 

18 students (mean age = 23.7)

Strain gauge

Likert

Effects of sound and type of sexual activity in sexual films. Correlations reported across sound conditions.

6 explicit heterosexual films, depicting 6 different activities.

.78

Gaither, Rosenkranz, Amato-Henderson, Plaud, and Bigwood (1996)

 

14 students (mean age missing)

Strain gauge

Likert

Sexual arousal and condom use in sexual stimuli. Correlations reported across condom conditions.

10 explicit audiotaped clips narrated by female.

−.07

Hall, Binik, and DiTomasso (1985)

 

20 students (mean age = 22)

Strain gauge

Likert

Concordance.

1 explicit heterosexual audiotape

.66

Correlations reported across gender of narrator.

Heard-Davidson, Heiman, and Kuffel (2007)

10 post-menopausal women (mean age = 56.8)

 

VPA

Likert

Effects of testosterone in postmenopausal women. Correlations reported for placebo conditions.

5 explicit heterosexual film clips.

.67

.71

Heiman et al. (1991)

7 volunteers (mean age = 32)

 

VPA

Likert

Sexual arousal and endocrine response.

2 explicit heterosexual film clips.

.00 (nonsig)

6 volunteers (mean age = 32)

1 explicit heterosexual film clip.

.00 (nonsig)

Henson et al. (1977)

10 students (mean age = 30)

 

Thermography

Likert

Validity of thermography.

1 explicit heterosexual film.

.57

Henson and Rubin (1978)

8 students (mean age = 28)

 

VBV

Likert

Comparison between two measures of genital arousal.

1 explicit heterosexual film.

.40

Thermography

.84

Henson et al. (1979)

8 volunteers (mean age = 24)

 

VPA

Likert

Comparing different measures of arousal in women.

1 explicit heterosexual film.

.76

VBV

.42

Thermography

.82

Hoon et al. (1982)

13 students (mean age = 24)

 

VPA/VBV

Likert

Menstrual cycle and arousal.

5 erotic heterosexual audiotapes.

.00/.33 (nonsig)

Thermography

3 min of unstructured fantasy.

.00 (nonsig)

VPA/VBV

Correlations reported across 5 sessions and cycle phases.

.00/.00 (nonsig)

Kockott, Feil, Ferstl, Aldenhoff, and Besigner (1980)

 

16 volunteers (mean age = 32.4)

Strain gauge

Visual-auditory scale

Sexual arousal and male sexual dysfunction. Correlations are averaged across 2 sessions.

2 heterosexual films, depicting foreplay.

.18

8 volunteers (mean age = 45.1)

.33

10 diabetic men (mean age = 47)

.23

8 with primary erectile dysfunction (mean age = 34)

.52

8 with secondary erectile dysfunction (mean age = 34)

−.10

7 with premature ejaculation during all sex acts (mean age = 33.6)

.39

9 with premature ejaculation during intercourse only (mean age = 33.6)

.57

Koukounas and McCabe (2001)

 

30 students (mean age = 29.5)

Strain gauge

Likert

Effects of attention and emotion.

5 explicit heterosexual films (no audio).

.83

Koukounas and Over (1999)

 

16 students (mean age = 21.9)

Strain gauge

Likert

Effects of attention and habituation. Correlations calculated over 18 habituation trials.

18 explicit heterosexual film clips (no audio)

.91

Letourneau and O’Donohue (1997)

25 students (mean age = 21)

 

VPA

Likert

Classical conditioning of female arousal. Correlations reported across 5 sessions.

50 explicit heterosexual film clips (female-centered).

.25

Pras et al. (2003)

9 medical patients treated with radiotherapy for gynecological cancer (mean age = 49.2)

 

VPA

Likert

Assessing feasibility of VPA to measure effects of radiotherapy on sexual function. Correlations reported across 3 films.

3 explicit heterosexual films (5, 9 and 10 min in length).

.50

8 healthy volunteers (mean age = 43.3)

−.12

Slob et al. (1991)

12 volunteers not using oral contraceptives (mean age = 25.9)

 

Thermography

Likert

Menstrual cycle phase and sexual arousal.

2 explicit heterosexual film clips (female-centered).

.10

.33

12 volunteers using oral contraceptives (mean age = 22.8)

.49

.75

Smith and Over (1987)

 

8 students (mean age = 28)

Strain gauge

Likert

Habituation of fantasy-induced arousal.

2 min of structured fantasy.

.79

Steinman, Wincze, Sakheim, Barlow, and Mavissakalian (1981)

8 students (mean age = 23)

8 students (mean age = 22)

VPA or strain gauge

Continuous/Likert

Comparison of male and female arousal. Correlations reported across all 4 films.

4 explicit films, depicting heterosexual sex, group sex and homosexual sex (male–male or female–female).

.48/.52

.80/.80

Wilson, Niaura, and Adler (1985)

 

32 students (mean age = 22.5)

Strain gauge

Visual-auditory scale

Effects of alcohol and selective attention. Same stimulus shown 4 different times under different conditions (instructional set and alcohol manipulation).

4 explicit heterosexual audiotapes, narrated by male.

.78

Wouda et al. (1998)

18 with dyspareunia (mean age = 25)

 

VPA

Likert

Vaginal plethysmography in women with dyspareunia.

3 explicit heterosexual film clips, depicting oral sex and vaginal intercourse.

.08

16 controls (mean age = 25)

.25

Notes: Correlations for average genital and subjective arousal reported when both max and mean values are available. Correlations are reported for control and placebo conditions, unless otherwise noted. Correlations that were reported as statistically nonsignificant were assigned value of zero (identified in table as “nonsig”). Sample is heterosexual unless stated otherwise

VPA vaginal pulse amplitude, VBV vaginal blood volume

Overall Gender Difference

We predicted there would be a gender difference in subjective-genital agreement, with male samples producing higher estimates than female samples. The total set of 132 studies produced 184 subjective-genital correlations for men and 280 correlations for women. There was a significant gender difference, with an average correlation of .56 (95% CI, .50 to .62) for men, and .25 (95% CI, .21 to .28) for women. The corresponding values for the correlation between perception of genital arousal and actual genital arousal were .73 (95% CI, .64 to .82) for men (based on 62 correlations) and .23 (95% CI, .18 to .27) for women (115 correlations); this was also a statistically significant gender difference.

A more convincing analysis would use independent samples as the unit of analysis, so that the number of effect sizes equaled the number of independent samples recruited in the 132 studies. The results of this independent samples analysis are shown in Table 2. Results showed that the average correlations were positive and significantly different from zero for both sexes, and the gender differences in the size of the correlations were again significant. The average Rgen correlations were nonsignificantly higher than the average Rsub correlations, for men only.
Table 2

Correlations for all independent samples, by sex

 

                          Men                         Women
                          Subjective    Genital       Subjective    Genital
Average (r)               .66           .76           .26           .23
95% confidence intervals  .57 to .75    .63 to .89    .21 to .32    .17 to .30
Samples (K)               81            29            108           55
Sample size (n)           1,732         630           2,345         1,305
Number of studies         57            19            74            39
Homogeneity (Q)           216.0         56.3          147.6         66.0
                          p < .0001     p < .005      p < .01       p = .15

Note: Subjective = correlation between subjective arousal and actual genital arousal (Rsub). Genital = correlation between perception of genital arousal and actual genital arousal (Rgen). A significant Q value means that effect sizes are not homogeneous, suggesting the presence of one or more moderator variables
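The quantities in the table above (an average correlation, its 95% confidence interval, and the homogeneity statistic Q) can be illustrated with a minimal sketch. It assumes a fixed-effect model with Fisher's z transformation and inverse-variance weights of n − 3, which is one common textbook approach and not necessarily the procedure used for the tables here; some intervals in the tables extend above 1.0, which back-transformed Fisher z intervals cannot do. The function name `pool_correlations` and the input values are hypothetical.

```python
import math

def pool_correlations(rs, ns):
    """Fixed-effect pooling of correlations via Fisher's z transform.

    rs: sample correlations; ns: matching sample sizes (each n > 3).
    Returns (pooled_r, ci_low, ci_high, q, df).
    """
    zs = [math.atanh(r) for r in rs]      # Fisher z transform of each r
    ws = [n - 3 for n in ns]              # inverse-variance weights, var(z) = 1 / (n - 3)
    total_w = sum(ws)
    z_bar = sum(w * z for w, z in zip(ws, zs)) / total_w
    se = 1.0 / math.sqrt(total_w)         # standard error of the pooled z
    # Q: weighted squared deviations from the pooled z; compared with a
    # chi-square distribution on K - 1 degrees of freedom.
    q = sum(w * (z - z_bar) ** 2 for w, z in zip(ws, zs))
    ci_low = math.tanh(z_bar - 1.96 * se)
    ci_high = math.tanh(z_bar + 1.96 * se)
    return math.tanh(z_bar), ci_low, ci_high, q, len(rs) - 1

# Hypothetical sample correlations and sample sizes (not taken from Table 1).
rs = [0.20, 0.35, 0.10]
ns = [30, 45, 25]
pooled, lo, hi, q, df = pool_correlations(rs, ns)
print(f"pooled r = {pooled:.2f}, 95% CI {lo:.2f} to {hi:.2f}, Q({df}) = {q:.2f}")
```

A large Q relative to its degrees of freedom indicates more variation among the sample correlations than sampling error alone would produce, which is the sense in which the note speaks of moderator variables.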

In the next analysis, we examined the gender difference for a subset of selected independent samples. We analyzed correlations obtained from only non-clinical samples and from studies using external stimuli (i.e., all participants within a study were exposed to the same visual, auditory, or text stimuli) and no experimental manipulations other than variation in the content of the sexual stimuli. We further restricted our analysis to samples for which participants were asked to estimate their subjective arousal during or right after stimulus presentation (not after the end of the study session), who were not asked to focus their attention on their genital or extra-genital sensations, who did not receive tactile stimulation of their genitals, and who were not exposed to distraction tasks during the presentation of the stimuli (e.g., concurrently monitoring numbers). This selected subset of independent samples represents what we consider to be a stronger test of the gender difference in agreement between subjective and genital measures of sexual arousal. Results for this selected subset of independent samples are shown in Table 3 and were very similar to the results obtained for all correlations and all independent samples. Both men and women produced correlations that were positive and significantly different from zero, and men produced significantly higher correlations than women. Women showed significantly lower correlations for perceptions of genital arousal than for subjective sexual arousal.
Table 3

Correlations for all independent samples, by sex, for selected studies

 

                          Men                         Women
                          Subjective    Genital       Subjective    Genital
Average (r)               .69           .79           .31           .20
95% confidence intervals  .56 to .82    .60 to .99    .24 to .38    .12 to .29
Samples (K)               45            16            65            32
Sample size (n)           987           366           1,349         678
Number of studies         36            13            55            26
Homogeneity (Q)           133.0         37.7          84.1          33.3
                          p < .0001     p < .001      p < .05       p = .35

Note: Selected studies refer to basic samples, without experimental manipulations, and with standard external sexual stimuli. Subjective = correlation between subjective arousal and actual genital arousal (Rsub). Genital = correlation between perception of genital arousal and actual genital arousal (Rgen). A significant Q value means that effect sizes are not homogeneous, suggesting the presence of one or more moderator variables

Because more correlations were coded as zero for female than male samples in this analysis, we calculated the average subjective-genital agreement for both sexes after excluding any samples for which a correlation was reported as non-significant and coded as zero (even if the sample correlation was a combination of multiple correlations, only one of which was coded as zero). The resulting concordance estimates are shown in Table 4. Men continued to show significantly greater concordance than women, suggesting that lower concordance estimates for women were not an artifact of coding statistically nonsignificant correlations as equal to zero. We did not examine agreement between perception of and actual genital arousal any further.
Table 4

Correlations for all independent samples, by sex, for selected studies, excluding studies with r = .00 (number of studies excluded in parentheses)

 

                          Men                         Women
                          Subjective    Genital       Subjective    Genital
                          (0)           (1)           (9)           (2)
Average (r)               .69           .84           .33           .22
95% confidence intervals  .56 to .82    .66 to 1.02   .26 to .40    .12 to .31
Samples (K)               45            15            56            30
Sample size (n)           987           350           1,170         619
Number of studies         36            12            46            24
Homogeneity (Q)           133.0         29.9          73.9          32.8
                          p < .0001     p < .001      p < .05       p = .28

Note: Selected studies refer to basic samples, without experimental manipulations, and with standard external sexual stimuli. Subjective = correlation between subjective arousal and actual genital arousal (Rsub). Genital = correlation between perception of genital arousal and actual genital arousal (Rgen). A significant Q value means that effect sizes are not homogeneous, suggesting the presence of one or more moderator variables

Finally, we examined all studies that reported concordance data from both men and women. These studies are perhaps the most relevant to the gender difference question because men and women were exposed to the same procedures and very similar laboratory conditions (except for the measure of genital arousal in most cases). The 13 studies produced 17 independent samples for each sex (total of 375 males and 424 females). Results showed a relatively high degree of agreement between subjective and genital arousal, for both sexes: .66 for men (95% CI, .49 to .83) and .44 for women (95% CI, .30 to .57), with men, again, showing a significantly higher degree of concordance. Results did not change when we selected only the samples (11 of 13 studies) that also met the restriction criteria described above for the analyses reported in Table 3; men (r = .71, CI, .50 to .91; n = 288) still showed a higher degree of agreement than women (r = .44, CI, .28 to .59; n = 317).

In sum, men and women showed significant agreement between self-reported (subjective) arousal and genital (objective) measures of arousal, with men showing a significantly higher level of agreement in all of our analyses. The variation in observed correlations was more heterogeneous than expected by chance for both genders, indicating the presence of one or more moderators influencing subjective-genital agreement. In the following sections, we examine potential moderators of degree of concordance.

We analyzed studies that allowed us to test methodological and then theoretical moderators. We initially planned to focus on studies that used within-subject correlations, because we were most interested in individual variation in response to sexual stimuli, but the number of such studies was often too small for meaningful analysis. We present results separately for within- and between-subject correlations when possible. As shown below, within- and between-subjects correlations tended to produce similar effect sizes. All moderator analyses were performed using the selected subset of samples—basic samples, external stimuli, no experimental manipulations except for the content of the stimuli—unless otherwise noted.

Methodological Moderators

Stimulus Modality

Stimulus modality can be distinguished as visual (pictures, movies) or non-visual (recorded stories, self-generated fantasies). Table 5 shows that a significant gender difference was found regardless of stimulus modality. There was no evidence that subjective-genital agreement was higher for women when they were exposed to non-visual stimuli; in fact, subjective-genital agreement was nonsignificantly lower in these conditions.
Table 5

Correlations between subjective and genital arousal by stimulus modality (selected studies, between-subjects correlations)

 

                          Visual only                 Non-visual only
                          Men           Women         Men           Women
Average (r)               .57           .30           .67           .25
95% confidence intervals  .38 to .75    .22 to .38    .53 to .82    .08 to .42
Samples (K)               23            43            7             15
Sample size (n)           572           996           216           429
Number of studies         16            37            7             13
Homogeneity (Q)           78.5          55.7          6.3           33.9
                          p < .0001     p = .08       p = .39       p < .005

Note: Selected studies refer to basic samples, without experimental manipulations, and with standard external sexual stimuli. A significant Q value means that effect sizes are not homogeneous, suggesting the presence of one or more moderator variables

Seven studies (n = 222) directly compared women’s responses to visual versus non-visual stimuli, producing average correlations of .37 (95% CI, .23 to .51) and .13 (95% CI, −.04 to .31), respectively; once again, higher correlations were obtained for visual stimuli.

A significant gender difference in agreement was also obtained when examining studies using only self-generated fantasy as a stimulus: .87 (95% CI, .62 to 1.12) for men (4 samples, n = 75), and .08 (95% CI, −.07 to .23) for women (7 samples, n = 197).

Stimulus Variation in Content or Modality

Studies that presented men with varied stimulus content or modalities did not produce larger correlations than studies that presented men with no variation in content or modality (see Table 6). Studies of women showed a different pattern: Women presented with more stimulus variation produced significantly larger correlations than women presented with no stimulus variation. A significant gender difference in concordance was eliminated for the small number of studies that varied stimulus content or modality. Unfortunately, only two studies (regardless of type of correlation) exposed the same participants to both variation and no-variation conditions.
Table 6

Correlations between subjective and genital arousal by stimulus variation (selected studies, between-subjects)

 

                          No variation in content or modality     Variation in content or modality
                          Men                 Women               Men                 Women
Average (r)               .62                 .26                 .60                 .49
95% confidence intervals  .42 to .82          .18 to .34          .47 to .74          .35 to .63
Samples (K)               22                  46                  8                   6
Sample size (n)           595                 1,019               244                 208
Number of studies         18                  40                  5                   4
Homogeneity (Q)           86.3                64.5                6.7                 4.0
                          p < .0001           p < .05             p = .46             p = .56

Note: Selected studies refer to basic samples, without experimental manipulations, and with standard external sexual stimuli. A significant Q value means that effect sizes are not homogeneous, suggesting the presence of one or more moderator variables

Number of Stimulus Trials

The relationship between number of stimulus trials (range of 1–16) and degree of subjective-genital agreement was examined separately for men and for women. For studies that reported between-subject correlations, the relationship was small and nonsignificant for both genders: r(23, n = 710) = .10, p = .62 for men, and r(47, n = 1,142) = −.17, p = .26 for women. Studies reporting within-subject correlations, however, suggested a positive but non-significant relationship for both genders: r(10, n = 164) = .39, p = .21 for men, and r(7, n = 69) = .45, p = .23 for women. Non-parametric correlations between number of trials and subjective-genital agreement produced similar results. Numbers in parentheses refer to degrees of freedom and sample size, respectively.
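As a worked check on how a correlation and its degrees of freedom relate to a p value, the sketch below computes the usual t statistic for a Pearson correlation, t = r√df / √(1 − r²), and a two-tailed p value by numerically integrating the t density. It assumes the p values in this section come from this standard test, which the text does not state explicitly; the function names are hypothetical and only the Python standard library is used.

```python
import math

def t_from_r(r, df):
    """t statistic for testing a Pearson correlation against zero."""
    return r * math.sqrt(df) / math.sqrt(1.0 - r * r)

def t_density(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1.0 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value via Simpson's rule on the t density over [0, |t|]."""
    t = abs(t)
    if t == 0.0:
        return 1.0
    h = t / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_area = total * h / 3.0  # P(0 < T < |t|)
    return 1.0 - 2.0 * central_area

# Within-subject result for men reported above: r = .39 with 10 degrees of freedom.
t_men = t_from_r(0.39, 10)
p_men = two_tailed_p(t_men, 10)
print(f"t = {t_men:.2f}, p = {p_men:.2f}")
```

The printed p value can be compared with the reported p = .21; small differences would be expected from rounding of r.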

Stimulus Duration

A similar analysis was conducted for stimulus duration. For men (range of 60–9,600 s), there was no evidence that stimulus duration was associated with concordance, whether using between-subjects, r(20, n = 662) = .16, p = .48, or within-subjects correlations, r(10, n = 164) = .01, p = .97. For women (range of 120–2,400 s), the relationship direction depended on the type of correlation: r(46, n = 1,104) = −.21, p = .16 for between-subjects, and r(7, n = 123) = .69, p < .05 for within-subjects. Non-parametric correlations produced the same pattern of results. Overall, then, concordance might be greater in women when stimuli are presented for a longer period of time.

Contiguous Versus Post-Trial Assessment of Subjective Arousal

Table 7 shows the usual gender difference for studies asking participants to report their subjective arousal at the end of each stimulus (post-stimulus), as indicated by the nonoverlapping 95% confidence intervals. The gender difference was smaller and no longer statistically significant when we examined studies using a contiguous method of assessing subjective sexual arousal, due to a lower degree of subjective-genital agreement for men. Women’s concordance did not seem to be affected by the timing of the subjective assessment.
Table 7

Correlations between subjective and genital arousal by timing of subjective assessment (selected studies, between-subjects)

 

                          Post-trial                  Contiguous
                          Men           Women         Men           Women
Average (r)               .66           .29           .44           .30
95% confidence intervals  .47 to .85    .20 to .38    .22 to .67    .09 to .50
Samples (K)               22            45            6             7
Sample size (n)           535           1,005         196           206
Number of studies         18            39            3             5
Homogeneity (Q)           74.0          67.2          8.5           9.9
                          p < .0001     p < .05       p = .13       p = .13

Note: Selected studies refer to basic samples, without experimental manipulations, and with standard external sexual stimuli. A significant Q value means that effect sizes are not homogeneous, suggesting the presence of one or more moderator variables
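The nonoverlapping-interval criterion applied to Table 7 can be expressed as a short check. This is a minimal sketch: the helper name `intervals_overlap` is hypothetical, and the interval bounds are the 95% confidence intervals reported in Table 7.

```python
def intervals_overlap(a, b):
    """Return True if closed intervals a = (low, high) and b = (low, high) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]

# 95% confidence intervals from Table 7, by timing of subjective assessment.
table7 = {
    "post-trial": {"men": (0.47, 0.85), "women": (0.20, 0.38)},
    "contiguous": {"men": (0.22, 0.67), "women": (0.09, 0.50)},
}

for timing, ci in table7.items():
    status = "overlap" if intervals_overlap(ci["men"], ci["women"]) else "no overlap"
    print(f"{timing}: {status}")
# prints "post-trial: no overlap" then "contiguous: overlap"
```

Nonoverlapping intervals are a conservative indication of a difference; overlapping intervals, as in the contiguous condition, do not by themselves show that the two groups are equal.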

Rgen Versus Rsub

This analysis was presented earlier in the section examining the overall gender difference. We note that the correlation between perception of and actual genital arousal was nonsignificantly higher than the correlation for subjective sexual arousal and genital sexual arousal, but for men only; asking women to report their perception of genital sensations resulted in lower, rather than higher, correlations. We also examined the average correlations for selected studies that reported both types of correlations on the same participants. For men, the average Rsub correlation was .61 (95% CI, .45 to .76; k = 13, n = 315), and the average Rgen correlation was .78 (95% CI, .55 to 1.01; n = 314). For women, the average Rsub correlation was .23 (95% CI, .11 to .35, k = 23, n = 522), and the average Rgen correlation was .20 (95% CI, .11 to .29). This pattern of results was very similar to what was reported for all selected studies.

Female Genital Arousal Measurement

Table 8 shows that the two components of vaginal photoplethysmography (VPA and VBV) produced similar concordance estimates. Thermography produced significantly higher subjective-genital correlations. We examined the three studies that directly compared men and women with thermography, regardless of the type of correlations reported. The first study produced correlations of .71 and .60 for men and women, respectively; the second, .73 and .70, and the third, .31 and .63.
Table 8

Correlations between subjective and genital arousal for women by type of physiological measure (selected studies, between-subjects)

 

                          VPA           VBV           Thermography
Average effect size (r)   .27           .28           .55
95% confidence intervals  .17 to .35    .07 to .49    .28 to .82
Samples (K)               42            7             6
Sample size (n)           1,018         118           97
Number of studies         35            7             5
Homogeneity (Q)           59.2          6.4           6.5
                          p < .05       p = .38       p = .26

Note: Selected studies refer to basic samples, without experimental manipulations, and with standard external sexual stimuli. A significant Q value means that effect sizes are not homogeneous, suggesting the presence of one or more moderator variables

VPA Versus VBV

The six studies that directly compared VPA and VBV (regardless of type of correlation) suggest a small, but non-significant advantage for VBV (r = .36, 95% CI, .13 to .60) over VPA (r = .23, 95% CI, .02 to .45) in terms of subjective-genital agreement.

Type of Correlation

In this analysis, reported in Tables 9 and 10, we examined the gender difference as a function of the type of correlation used in the study, focusing on the same selected subset of samples used in the prior analyses.
Table 9

Correlations between subjective and genital arousal for men by type of correlations (selected samples)

 

                          Within-subject    Between-subject   Mixed
Average effect size (r)   .91               .62               .66
95% confidence intervals  .70 to 1.12       .46 to .78        .28 to 1.04
Samples (K)               12                28                7
Sample size (n)           164               731               105
Number of studies         10                21                6
Homogeneity (Q)           14.3              91.4              16.5
                          p = .22           p < .0001         p < .05

Note: Selected studies refer to basic samples, without experimental manipulations, and with standard external sexual stimuli. A significant Q value means that effect sizes are not homogeneous, suggesting the presence of one or more moderator variables

Table 10

Correlations between subjective and genital arousal for women by type of correlations (selected studies)

 

                          Within-subject    Between-subject   Mixed
Average effect size (r)   .43               .29               .26
95% confidence intervals  .24 to .63        .21 to .37        .02 to .50
Samples (K)               10                50                7
Sample size (n)           133               1,144             88
Number of studies         9                 42                6
Homogeneity (Q)           8.1               72.9              2.5
                          p = .52           p < .05           p = .87

Note: Selected studies refer to basic samples, without experimental manipulations, and with standard external sexual stimuli. A significant Q value means that effect sizes are not homogeneous, suggesting the presence of one or more moderator variables

The results presented in Tables 9 and 10 were remarkably consistent across type of correlation. The average correlations were positive and significantly different from zero for both sexes, and the gender difference was present for all three types of correlation. In men, within-subjects correlations were significantly larger than between-subjects or mixed correlations. In women, within-subjects correlations were significantly larger than between-subjects correlations. Only one study (of men) reported more than one type of correlation, so we could not directly compare subjective-genital agreement across type of correlation in the same set of studies (Mavissakalian, Blanchard, Abel, & Barlow, 1975).

Number of Data Points

The range of data points for men was 8–240, and for women it was 7–115. Correlations based on larger numbers of data points did not produce higher between-subjects correlations for men, r(24, n = 720) = −.13, p = .54, or women, r(47, n = 1,142) = .07, p = .62. Studies using within-subject correlations showed a trend toward a negative relationship in men, r(10, n = 164) = −.39, p = .21, but not in women, r(7, n = 123) = .08, p = .84. Non-parametric correlations were very similar.

Average Sample Age

The sample age range was 19–38.5 for men and 19–48 for women. Focusing on studies reporting between-subject correlations (619 men and 1,059 women), there was a near-significant association between the average sample age and subjective-genital agreement for men, r(20) = .42, p = .05, and a small and non-significant association for women, r(42) = .18, p = .25. Within-subjects correlations (164 men and 113 women) showed a non-significant negative association for men, r(10) = −.27, p = .40, and a near-zero association for women, r(6) = .084, p = .84. Non-parametric correlations produced very similar results.

Hormones Among Female Samples

There was no significant difference between female samples distinguished according to whether they were taking oral contraceptives. Examining the between-subjects correlations, the 14 samples of women who were not taking oral contraceptives (10 studies, n = 259) produced an average correlation of .39 (.26 to .53), whereas the nine samples of women who were taking oral contraceptives (9 studies, n = 239) produced an average correlation of .32 (.10 to .53).

Theoretically-Derived Moderators

Female-Centered Stimuli

Table 11 presents the results of the comparison when participants were exposed to female-centered stimuli versus typical, commercially available sexual content. The results (between-subjects) suggest that the gender difference in subjective-genital agreement was observed with both types of stimuli. The gender difference was larger with typical stimuli, mostly because the degree of agreement was lower in men when they were presented with female-centered erotica, and female-centered sexual stimuli did not increase subjective-genital agreement among women.
Table 11

Correlations between subjective and genital arousal by stimulus type (selected studies, between-subjects)

 

                          Female-centered             Not female-centered
                          Men           Women         Men           Women
Average (r)               .47           .29           .65           .27
95% confidence intervals  .33 to .62    .17 to .40    .45 to .86    .18 to .36
Samples (K)               8             28            20            23
Sample size (n)           209           680           538           526
Number of studies         5             22            17            21
Homogeneity (Q)           5.9           48.4          77.8          18.7
                          p = .55       p < .01       p < .0001     p = .66

Note: Selected studies refer to basic samples, without experimental manipulations, and with standard external sexual stimuli. A significant Q value means that effect sizes are not homogeneous, suggesting the presence of one or more moderator variables

Erotic Versus Explicit Stimuli

There were very few studies that used erotic (less explicit) stimuli. As shown in Table 12, there was a significant gender difference in agreement among studies involving explicit stimuli. The five studies of women presented with erotic stimuli suggest a similar low degree of subjective-genital agreement.
Table 12

Correlations between subjective and genital arousal by stimulus type (selected studies, between-subjects)

 

                          Explicit                    Erotic
                          Men           Women         Men           Women
Average (r)               .62           .29                         .27
95% confidence intervals  .46 to .77    .21 to .37                  −.07 to .60
Samples (K)               28            48                          5
Sample size (n)           732           1,109                       88
Number of studies         21            40                          5
Homogeneity (Q)           91.4          69.9                        7.4
                          p < .0001     p < .05                     p = .12

Note: Selected studies refer to basic samples, without experimental manipulations, and with standard external sexual stimuli. A significant Q value means that effect sizes are not homogeneous, suggesting the presence of one or more moderator variables

Basic Versus Clinical Samples

We predicted that basic samples would produce higher estimates of subjective-genital agreement than clinical samples of sexually dysfunctional participants. For this analysis, we examined studies that directly compared basic with clinical, sexually dysfunctional samples, regardless of type of correlation (because the same type of correlation was used to compare basic and clinical samples within each study). Otherwise the same restrictions were applied (selected studies).

Three studies compared male basic and sexually dysfunctional samples. Subjective-genital correlations were positive and similar for both groups of men: r = .49 (.19 to .80) for the 4 basic samples (n = 53), and r = .49 (.18 to .79) for the 6 sexually dysfunctional samples (n = 59). Eleven studies compared basic and sexually dysfunctional samples of women, with no significant difference found between the two groups of women: r = .09 (−.07 to .25) for the 11 basic samples (n = 231), and r = .04 (−.09 to .17) for the 11 sexually dysfunctional samples (n = 253).

We next examined all selected studies (between-subjects correlations) of sexually dysfunctional men or women. There were 10 studies of women but only one study of men. The average correlation for sexually dysfunctional women (n = 235) was .04 (−.10 to .17). This average correlation can be directly compared to non-clinical samples of women in Table 10 (between-subjects average correlation of .29) and suggests that sexually dysfunctional women show even lower concordance than sexually functional women. It is unclear, however, why the non-dysfunctional women in studies reporting on both dysfunctional and functional women produced such low correlations.

Funnel Graph

A funnel graph allowed us to check for publication bias towards larger (or smaller) effect sizes. Publication bias is a concern in meta-analyses of this kind because statistically significant findings may be more likely to be deemed interesting and accepted for publication. Figure 1 displays a funnel graph illustrating the relationship between subjective-genital correlation and sample size, by gender, for all independent and selected samples. First, it is clear that there were two clusters of correlations, one for the male samples and one for the female samples, with some overlap between the two, especially for studies producing low correlations. Second, the scatterplot shows heteroscedasticity—higher variance in correlations for smaller than for larger sample sizes, as expected from the Central Limit Theorem if there is no publication bias towards either larger or smaller effect sizes. Third, the largest samples show correlations that were fairly close to the overall mean for each sex, again as expected if there is no publication bias.
Fig. 1

Funnel graph of the (un-weighted) correlations between subjective and genital arousal, using selected samples (top and bottom horizontal lines represent the male and female unweighted averages, respectively)
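The funnel shape expected in the absence of publication bias can be illustrated with a minimal sketch. It assumes the common large-sample approximation for the standard error of a correlation, SE ≈ (1 − ρ²) / √(n − 1), and uses the women's between-subjects average of .29 from Table 10 as an illustrative true value; the function names and the sample sizes are hypothetical and are not taken from Figure 1.

```python
import math

def approx_se_of_r(rho, n):
    """Large-sample approximation to the standard error of a sample correlation."""
    return (1.0 - rho ** 2) / math.sqrt(n - 1)

def funnel_band(rho, n):
    """Approximate 95% range of sample correlations around rho for sample size n."""
    half_width = 1.96 * approx_se_of_r(rho, n)
    return rho - half_width, rho + half_width

# Illustrative true correlation and sample sizes (assumptions for this sketch).
rho = 0.29
for n in (8, 20, 50, 100):
    low, high = funnel_band(rho, n)
    print(f"n = {n:3d}: {low:.2f} to {high:.2f}")
```

The band narrows as n grows, so unbiased samples would scatter widely around the mean at small n and cluster near it at large n. The approximation is rough for very small samples, where the band can extend beyond the possible range of a correlation.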

Discussion

The present study examined the gender difference in concordance between subjective and genital measures of sexual arousal. An overall gender difference in concordance was found across all samples, across all independent samples, using a selected subset of independent samples, and in studies that included both female and male samples. In almost all of the comparisons, men produced higher subjective-genital correlations than women; only two of the comparisons—studies using contiguous assessments of self-reported arousal and studies presenting varied stimulus content or modality—showed no statistically significant gender differences in concordance but, in both cases, men still tended to show greater concordance than women. In none of the analyses did we find that women produced higher concordance estimates than men. Based on these convergent and consistent results, we conclude that a gender difference in concordance exists, with men demonstrating higher subjective-genital agreement than women.

Is the Gender Difference Due to Methodological Artifact?

After determining that a gender difference existed, we searched for potential moderators of concordance between subjective and genital sexual arousal. Although we hypothesized that many of the methodological and theoretically-derived moderators would help explain variation in female correlations specifically, our results showed that the female correlations were often homogeneous (i.e., variation in correlations did not exceed that expected by chance) and thus did not require further examination of moderators to explain variability in the estimates that were obtained. In contrast, male correlations were typically heterogeneous. Moderators associated with methodological variation did not fully account for the gender difference in concordance because they were not significantly or strongly correlated with concordance estimates. Of the methodological moderators that we examined, only two (method of assessing self-reported sexual arousal and stimulus variation) produced no statistically significant gender difference in concordance when between-subjects correlations were examined (see Footnote 4). These particular variables are discussed in further detail below.
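
The homogeneity criterion mentioned above (variation in correlations no greater than expected by chance) is commonly checked with a Q statistic. The sketch below shows one standard formulation, computed on Fisher z values with weights of n - 3. It is an assumption that this matches the test used in the present analyses, and the correlations and sample sizes are invented purely for illustration.

```python
import math
from typing import Sequence, Tuple


def homogeneity_q(rs: Sequence[float], ns: Sequence[int]) -> Tuple[float, int]:
    """Return (Q, df) for a set of independent correlations.

    Q is the weighted sum of squared deviations of the Fisher z values
    from their weighted mean. Under homogeneity it is approximately
    chi-square distributed with df = k - 1.
    """
    if len(rs) != len(ns) or len(rs) < 2:
        raise ValueError("need at least two correlations with matching sample sizes")
    zs = [math.atanh(r) for r in rs]
    ws = [n - 3 for n in ns]  # inverse sampling variance of each z
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    q = sum(w * (z - z_bar) ** 2 for w, z in zip(ws, zs))
    return q, len(rs) - 1


if __name__ == "__main__":
    # Hypothetical data: a tight cluster versus a widely spread set.
    tight = homogeneity_q([0.24, 0.26, 0.28], [30, 30, 30])
    spread = homogeneity_q([0.20, 0.60, 0.85], [30, 30, 30])
    print("tight cluster: Q = %.2f on %d df" % tight)
    print("spread set:    Q = %.2f on %d df" % spread)
```

Q is then compared with the chi-square critical value for k - 1 degrees of freedom (5.99 for 2 df at the .05 level). A value below the critical value is read as homogeneity, in which case a search for moderators is not needed to explain the spread.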

In the following sections, we highlight results of the moderator analyses that we believe have implications for the design and interpretation of sexual psychophysiology research. Though our results suggest only two of these variables might help explain the gender difference in concordance, other moderators may still influence the strength of concordance within the sexes. These selected results are discussed in the following order: stimulus characteristics; assessment of subjective sexual arousal; assessment of genital arousal; statistical methods; and individual differences.

Stimulus Characteristics

Number of Stimulus Trials

The relationship between concordance and number of stimulus trials differed for women and men and differed by type of correlation. For men, both between- and within-subjects correlations tended to be related to number of stimulus trials. For women, no relationship was observed for between-subjects correlations, but a larger and positive, though still not statistically significant, association was found for within-subjects correlations. This can be interpreted as follows: Across a group of women, giving each woman more opportunities to attend to and report her sexual arousal is not related to higher concordance, but when concordance is estimated using within-subjects correlations, more opportunities to report sexual arousal tend to be associated with higher concordance. This suggests that within-subjects concordance might be influenced by learning for both women and men. Alternatively, this could also be a result of the fact that more data points for within-subjects correlations may lead to more reliable estimates of subjective-genital agreement. Further research directly manipulating number of stimulus trials and observing the effect on concordance in women and men is necessary to test these hypotheses.

Stimulus Modality

We compared concordance for visual versus nonvisual or fantasy stimuli, predicting that women would show greater concordance for nonvisual modalities. The results were contrary to our prediction; for women, the highest estimates of concordance were obtained for visual stimuli, followed by nonvisual and then fantasy stimuli. Seven studies that measured women’s sexual responses to both visual and nonvisual stimuli within the same experiment also produced greater concordance estimates for visual sexual stimuli. This effect may be related to the typically lower levels of sexual arousal obtained using nonvisual modalities of sexual stimuli (Sakheim et al., 1985). Concordance may be attenuated when levels of sexual arousal are lower because women are less able to detect changes in vaginal blood flow when there is limited variability in genital responding (Heiman, 1977).

This speculation about concordance and stimulus modality assumes that subjective sexual arousal is related to the detection of genital changes associated with sexual arousal. A test of this hypothesis conducted by Laan, Everaerd, van der Velde et al. (1995), however, found that concordance was not affected by the magnitude of genital sexual arousal. Another possibility is that audiovisual sexual stimuli occupy a greater number of sensory channels and thereby recruit greater attention to sexual stimuli, therefore leading to greater sexual responses, both subjectively and genitally (Koukounas & McCabe, 1997).

Men showed an opposite trend for stimulus modality: their highest concordance was observed for fantasy stimuli, followed by nonvisual and then visual stimuli, though none of these estimates were significantly different from each other. For men, stimulus modality did not affect concordance, even though men tended to produce greater subjective or genital sexual arousal to visual versus other modalities of sexual stimuli (Heiman, 1977).

Female-Centered Stimuli

Past research has shown that women experience greater positive affect and subjective arousal to female-centered stimuli (Laan et al., 1994; Mosher & MacIan, 1994). We predicted that affective responses to sexual stimuli would influence subjective-genital agreement. Viewing female-centered stimuli did not, however, produce greater concordance among women. The gender difference in concordance was found for both female-centered and typical, commercially available sexual films. Men showed significantly lower concordance for female-centered stimuli, resulting in a smaller though still significant gender difference for studies that presented female-centered stimuli.

This last result may reflect the fact that female-centered stimuli are less likely to depict explicit sexual intercourse and sustained close-ups of genital interactions, and any such scenes tend to be shorter in duration than in typical, commercially available films. The absence or relatively lower frequency of sexually explicit cues may influence subjective sexual arousal more than penile response among men, thus producing lower concordance. In addition, typical sexual films are more likely to focus on male pleasure and control over sex acts, and these elements may contribute to greater absorption into the stimulus and greater subjective (but not genital) sexual arousal among men. Because our data set did not include enough studies, we could not examine male sexual responses to erotic versus explicit stimuli, so our explanation for the difference in male concordance according to stimulus explicitness is only speculative. We also note that this analysis examined the role of affect in subjective-genital agreement indirectly, through the use of different types of sexual stimuli. To more directly determine the role of affect in concordance, studies designed to manipulate affect and examine the impact on subjective and genital responses need to be conducted.

Stimulus Variation

Studies that included stimuli varying in content or modality produced significantly greater positive correlations for women, but not for men, and thus produced no significant gender difference. This result may reflect two different effects. The first reflects general principles in psychophysics and psychometrics and is equally applicable to men and women: Greater variation in stimulus content and stimulus modality should produce greater variation in sexual response, and this can make it easier for participants to detect changes in their subjective or genital response. This would not explain, however, why we found an effect for women but not for men. The second effect is more applicable to women: Perceptions of internal states are thought to be more influenced by external cues in women while, in men, perceptions are more dependent on internal cues such as the physical signatures of emotional states (Pennebaker & Roberts, 1992). Sexual stimulus properties may represent salient external cues that women can use to more accurately estimate their subjective sexual arousal, and thus to produce higher correlations with their genital responses. The gender difference in the importance of internal versus external cues is discussed in greater detail below.

Assessment of Subjective Sexual Arousal

Perception of Genital Responding versus Assessment of Subjective Arousal

Sexual psychophysiology studies often differ in their operationalization of subjective sexual arousal. We hypothesized that asking participants to report perceptions of their genital changes might yield a smaller gender difference because participants are given a specific perceptual task to complete, whereas reporting mental sexual arousal is more global and impressionistic. Moreover, asking participants to report their perception of genital sensations can be viewed as a form of attention manipulation, as people direct their attention to monitoring physical cues.

Gender differences in concordance were found for both subjective sexual arousal and perception of genital sensations. For men, estimates of concordance were greater when men were asked to report their perception of genital sensations versus their subjective feelings of sexual arousal whereas, for women, no significant difference was found when comparing the two forms of subjective appraisal. The gender difference remained in the subset of studies where women and men were asked to report both subjective sexual arousal and perception of genital sensations. In these studies, men continued to show greater concordance for perception of genital sensations, and no significant difference between the two forms of subjective appraisal was found for women. Greater attention to physical cues increased concordance between subjective and genital sexual arousal only among men. Together, these results suggest that the gender difference in concordance cannot be entirely explained by a gender difference in the visibility and awareness of external genitalia.

Timing of Assessing Self-Reported Arousal

Contiguous assessment of sexual arousal produced no significant gender difference in concordance. Contiguous assessment was associated with lower concordance among men, but was not associated with greater concordance among women. However, relatively few studies used contiguous assessment compared to post-trial ratings, and only two studies directly compared men and women using contiguous assessments (Chivers et al., 2004, 2007).

Other studies using contiguous assessment of sexual arousal have shown that men have lower penile responses when they are asked to monitor their subjective sexual arousal while simultaneously watching sexual stimuli, possibly because of distraction (Geer & Fuhr, 1976; Wincze et al., 1980). At the same time, research on the effects of cognitive distraction on subjective sexual arousal elicited by visual stimuli suggests that, for men, subjective feelings of arousal remain stable despite distraction (Przybyla & Byrne, 1984). Thus, it is possible that contiguous assessment of self-reported arousal is a form of cognitive distraction that reduces penile responding but does not similarly attenuate subjective appraisals of sexual arousal, thereby resulting in lower concordance for men. Decrements in penile responding during contiguous assessment of self-reported arousal may also occur because contiguous assessment could function as a form of counter-productive, third-person attention to sexual response that ultimately interferes with the development of erection, similar to the spectatoring process described by Masters and Johnson (1970). Whatever the cause, for men, concordance is maximized by using post-trial assessments of self-reported sexual arousal.

Among women, contiguous assessment of sexual arousal tends to increase concordance, although not significantly so. Post-trial assessments of sexual arousal may be more prone to reporting biases among women. Alternatively, completing a simultaneous self-assessment task may reduce discomfort elicited in women when watching sexual stimuli. Supporting this latter hypothesis, studies of the effects of distraction on subjective and genital sexual arousal have shown that women have higher concordance during distraction conditions (Adams, Haynes, & Brayer, 1985). Only two studies included in the meta-analysis directly compared men and women using a contiguous assessment of subjective sexual arousal, however, so it is clear that more research is needed to examine this possible gender difference.

Assessment of Genital Arousal

Device Used to Assess Female Genital Arousal

Vaginal photoplethysmography measures haemodynamic events that may not be perceptible to women (Henson et al., 1979). Women’s reports of feeling sexually aroused may, therefore, be more strongly related to other physiological cues that are more available to conscious awareness. Changes in genital temperature, measured using thermography, may yield stronger concordance in women. The data supported this hypothesis: Concordance estimates obtained using VPA and VBV were significantly lower than estimates obtained using thermography, and the magnitude of the correlation obtained with thermography was in the range of the estimates reported for men. Using thermography may, therefore, yield greater estimates of concordance for women. We note, however, that the total number of studies employing thermography is still small. Whether thermography produces more valid estimates of concordance remains to be confirmed with more studies comparing genital assessment methods.

Three studies have compared concordance estimates obtained using thermography for both women and men. The first reported high correlations for both sexes (Abramson, Perry, Seeley, Seeley, & Rothblatt, 1981). The second study, which used groin skin temperature as the objective measure of sexual response, reported higher concordance for women (Rubinsky, Hoon, Eckerman, & Amberson, 1985). This interesting result suggests that nongenital temperature change may be among the physical cues women use to appraise their state of sexual arousal. The third, conducted by Kukkonen, Binik, Amsel, and Carrier (2007), reported gender differences in subjective sexual arousal only for assessments during the first 5 min of the stimuli, whereas no gender difference was found for the latter two time periods. These results suggest that, for women, development of subjective sexual arousal that mirrors genital responding takes longer than 5 min, at least when assessed using thermography. This is an interesting finding because the gender difference in concordance may be related to the length of laboratory stimuli. In the present meta-analysis, however, stimulus length was unrelated to concordance in women but was related to concordance in men.

Other factors may account for the high female concordance and lack of gender difference in subjective-genital agreement reported in the Kukkonen et al. (2007) study. Concordance was calculated using between-subjects correlations, using pairs of data points from the baseline, neutral, sexual, and humorous conditions. Only in the sexual condition, however, did subjective and genital sexual arousal increase significantly from baseline. Variability in genital and subjective responding in the nonsexual conditions was low. Recent research using vaginal photoplethysmography has shown that calculating subjective-genital agreement across both nonsexual and sexual stimuli increases concordance estimates (Suschinsky et al., 2009). These authors found that concordance estimates were high for both women and men (r = .48 and .53, respectively) and showed no significant gender difference when concordance was calculated using data from both sexual and nonsexual stimuli. The gender difference re-emerged when concordance was calculated using only sexual stimuli, r = .29 and .60, respectively. Further research is needed to determine whether thermographic assessment of genital vasocongestion yields similar concordance estimates in women and men when more conservative methods of calculating the association are used (see Footnote 5).
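
A comparison of this kind, between two correlations from independent groups, is often made with a z test on Fisher-transformed values. The sketch below applies that standard test to the correlations quoted above. The group sizes are not given in this section, so the value of 40 per group is a hypothetical placeholder, and the function name `compare_independent_correlations` is likewise illustrative. The resulting p values therefore show only the direction of the effect, not the significance levels reported by Suschinsky et al. (2009).

```python
import math
from typing import Tuple


def compare_independent_correlations(
    r1: float, n1: int, r2: float, n2: int
) -> Tuple[float, float]:
    """Two-sided z test for the difference between two independent correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher z transforms
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # SE of the difference
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))           # two-sided normal p value
    return z, p


if __name__ == "__main__":
    n = 40  # hypothetical group size, NOT taken from the cited study
    z_all, p_all = compare_independent_correlations(0.48, n, 0.53, n)
    z_sex, p_sex = compare_independent_correlations(0.29, n, 0.60, n)
    print(f"sexual + nonsexual stimuli: z = {z_all:.2f}, p = {p_all:.2f}")
    print(f"sexual stimuli only:        z = {z_sex:.2f}, p = {p_sex:.2f}")
```

Whatever group size is assumed, the women-minus-men difference is much larger in the sexual-stimuli-only comparison, which is the sense in which the gender difference re-emerges.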

Statistical Methods

Within versus Between-Subjects Correlations

The method of calculating concordance has been proposed as one potential source of the gender difference in concordance. Within-subjects correlations estimate concordance at an individual level, that is, whether a person’s genital responses elicited by a set of sexual stimuli are related to subjective appraisals of the same stimuli. Between-subjects correlations estimate concordance at a group level, that is, whether individuals who produce greater genital responses also produce greater estimates of sexual arousal. In our meta-analysis, calculating within-subjects correlations revealed a similar pattern to the results obtained using between-subjects correlations. Concordance estimates were significantly higher for both men and women when within-subjects correlations were calculated. The number of studies on which this result is based, however, is small.

We encourage other researchers to be explicit about how they calculate concordance and to consider the meaning of within- versus between-subjects correlations when deciding what data to collect and which calculation to perform. The former is the most relevant to examining individual integration of psychological and physiological sexual responses, whereas the latter is informative with regard to establishing the concurrent validity of subjective or genital measures of sexual arousal.
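
To make the distinction concrete, the following sketch computes both kinds of correlation from the same invented data set (three participants, four stimuli each). The data, the helper names, and the choice to base the between-subjects correlation on each participant's mean response are all assumptions made for illustration; individual studies may instead compute between-subjects correlations separately for each stimulus.

```python
import math
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return sxy / math.sqrt(sxx * syy)


def within_subject_rs(genital: Sequence[Sequence[float]],
                      subjective: Sequence[Sequence[float]]) -> List[float]:
    """One correlation per participant, computed across that person's stimuli."""
    return [pearson_r(g, s) for g, s in zip(genital, subjective)]


def between_subjects_r(genital: Sequence[Sequence[float]],
                       subjective: Sequence[Sequence[float]]) -> float:
    """One correlation across participants, using each person's mean response."""
    g_means = [sum(g) / len(g) for g in genital]
    s_means = [sum(s) / len(s) for s in subjective]
    return pearson_r(g_means, s_means)


if __name__ == "__main__":
    # Hypothetical responses: rows are participants, columns are stimuli.
    genital = [[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8]]
    subjective = [[7, 8, 9, 10], [4, 5, 6, 7], [1, 2, 3, 4]]
    print("within-subjects r per participant:", within_subject_rs(genital, subjective))
    print("between-subjects r:", between_subjects_r(genital, subjective))
```

In this deliberately extreme example every participant tracks her or his own genital response perfectly across stimuli, yet participants with higher average genital responses report lower average arousal, so the two indices point in opposite directions. The two calculations answer different questions and should not be treated as interchangeable.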

Individual Differences

Age

Men showed a positive relationship between age and concordance, but contrary to our prediction, no relationship was found for women. This cannot be attributed to a sampling bias, as the age range was similar for male and female samples. This suggests that any learning processes influencing subjective appraisals of sexual arousal or concordance may occur for men only.

Oral Contraceptive Use

Exogenous hormones such as oral contraceptives (OC, hereafter) are known to affect women’s sexual desire, and are associated with increased sex-hormone binding globulin and reduced free testosterone (Panzer et al., 2006). OC use has variable effects on sexual psychophysiology, with no effects on subjective sexual arousal and perception of genital sensations and variable effects on genital response (Seal, Brotto, & Gorzalka, 2005). Given these differential effects on sexual psychology and physiology, we hypothesized that using OC could affect concordance in women. No effects of oral contraceptive use were observed in our analysis. We note, however, that only one study included in the meta-analysis directly compared concordance in women using OC with those not using OC. Seal et al. (2005) reported a statistically nonsignificant trend toward an effect of OC to increase agreement between subjective and genital sexual arousal (.50 before OC use and .82 after OC use), as well as concordance between perception of genital response and actual genital response (from .11 to .57). It is noteworthy that these investigators obtained such high estimates of concordance using between-subjects correlation calculated from a small sample of 16 women. Further investigation into the effects of OC on concordance and sexual response is needed.

Is the Gender Difference Explained by Learning, Attention, or Information Processing?

Moderators derived from learning, attention, or information processing explanations did not account for the gender difference in concordance. Above, we suggested that learning or attention explanations would link concordance to number of stimulus trials, duration of stimuli, participant age, and whether participants were asked to assess their perceptions of genital change. We also suggested that an information processing explanation would link concordance to whether the sexual stimuli were self-generated fantasies, sexually explicit, or female-centered; participants were instructed to focus on their genital sensations or not; and subjective sexual arousal was assessed contiguously versus after the trial or at the end of the session.

Contrary to these theoretically-derived predictions, the gender difference in concordance was still found when comparing visual and nonvisual modalities, female-centered versus typical sexual films, and erotic versus explicit sexual stimuli. Concordance estimates were not significantly or consistently related to number of stimulus trials, stimulus length, or female age. Only the timing of subjective sexual arousal assessment was significantly related to the gender difference in concordance.

If the gender difference in concordance is robust, as the present data suggest, what then can explain it? The hypotheses we derived from learning and information processing theories were not supported, and methodological factors cannot fully account for men’s higher subjective-genital agreement (or women’s lower subjective-genital agreement). Our finding raises the question of whether low concordance is the norm in women, and what purpose, if any, concordance serves in human sexual functioning. We discuss possible explanations for low female concordance in the next section.

Other Explanations for Low Female Concordance

Is Female Genital Response Reflexively Activated?

Chivers (2005), Laan (1994), and van Lunsen and Laan (2004) have all speculated that female genital response is an automatic reflex that is elicited by sexual stimuli and produces vaginal lubrication, even if the woman does not subjectively feel sexually aroused. Reflexively activated genital response would result in lower concordance overall because genital vasocongestion is not necessarily accompanied by subjective sexual arousal. If female genital response (and thus vaginal lubrication) is indeed reflexively activated, one would expect genital responses to be observed even when women are exposed to nonpreferred sexual stimuli (i.e., sexual stimuli that they do not find subjectively appealing), and under conditions where sexual stimuli are presented subliminally.

Recent research suggests that female genital response can be evoked by a broader array of sexual stimuli than can male genital response. With respect to sexual orientation, heterosexual women show substantial genital responses to both male and female sexual stimuli, whereas heterosexual men show greater genital responses to female stimuli and homosexual men show greater genital responses to male stimuli (Chivers et al., 2004, 2007; Chivers & Bailey, 2005; Peterson, Janssen, & Laan, in press; Suschinsky et al., 2009). Typically, an increase in genital response is evoked by these sexual stimuli even though women report little or no experience of feeling sexually aroused, resulting in lower concordance estimates than are typically found among men.

Further evidence supporting the automaticity of genital responding in women comes from research on the voluntary control of sexual arousal. Automatic genital response would be observed if one were unable to consciously suppress sexual arousal when instructed to inhibit sexual responding. Laan, Scholte, and van Stegeren (2006) reported that women were poor at voluntarily suppressing subjective and genital responding, whereas men showed a greater ability to voluntarily suppress genital responses. Using functional magnetic resonance imaging, the same team suggested that suppression of sexual arousal may be automatic in women but not in men: Men showed increased prefrontal cortex activation during inhibition trials, suggesting conscious effort to suppress responding, whereas women did not (Laan, 2007). Instead, women showed increased anterior cingulate cortex activity (associated with many functions, including modulation of emotional responses) during both inhibition and respond-as-usual trials. This suggests that, during processing of sexual stimuli, brain areas associated with emotional inhibition are activated among women, regardless of the study instructions. Perhaps this is the root of low concordance in women: Genital responses are not affected by involuntary inhibition involving the anterior cingulate cortex, but subjective responses are.

The reflexive activation of vaginal responding by sexual cues may serve a protective function for women. Female genital response entails increased genital vasocongestion, necessary for the production of vaginal lubrication, and can, in turn, reduce discomfort and the possibility of injury during vaginal penetration. Ancestral women who did not show an automatic vaginal response to sexual cues may have been more likely to experience injuries that resulted in illness, infertility, or even death subsequent to unexpected or unwanted vaginal penetration, and thus would be less likely to have passed on this trait to their offspring.

Reports of women’s genital response and orgasm during sexual assault (Levin & van Berlo, 2004) and research showing that women experience genital responses to sexual threat stimuli (Both, Everaerd, & Laan, 2003; Both & Laan, 2007; Laan, Everaerd, & Evers, 1995; Stock, 1983; Suschinsky et al., 2009) suggest that genital responses do occur in women under conditions of sexual threat. That women can experience genital response during unwanted sex or when viewing depictions of sexual assault suggests that women’s vasocongestion response is automatically initiated by exposure to sexual stimuli, whether or not these stimuli are preferred, and without subjective appraisal of these stimuli as sexually arousing or desired.

This notion of automatic vaginal response has implications for research attempting to identify drug treatments for women with sexual arousal disorders. Studies examining the effects of pharmaceuticals such as sildenafil citrate on female sexual response have generally found significant drug effects on genital response, but not subjective sexual arousal (Laan et al., 2001, 2002; Meston & Heiman, 1998; Meston & Worcel, 2002). Because of the low concordance observed in women, we predict that peripherally-acting drugs that only increase genital response will not be effective treatments for female sexual arousal disorder, except in those cases where women experience subjective sexual arousal without concomitant vaginal vasocongestion and lubrication, which Basson, Brotto, Laan, Redmond, and Utian (2005) have described as genital sexual arousal disorders.

Is There a Relationship Between Concordance and Sexual Functioning?

Is there any evidence that a gender difference in concordance has any bearing on sexual functioning? That is, does high concordance matter? Based on cognitive models of sexual response, one would expect concordant subjective and genital response to be a desirable, or even necessary, state for satisfactory sexual functioning (e.g., Barlow, 1986). Yet current revisions to definitions of women’s sexual function and dysfunction recognize the capacity for low concordance in women (Basson et al., 2003). Low concordance between self-reported and genital sexual arousal may be the norm for many women. Subjective-genital agreement calculated within-subjects can vary tremendously, however, such that some women’s reports of sexual arousal are unrelated to their genital responses, or even negatively related, whereas others show large and positive correlations between self-reported sexual arousal and genital vasocongestion (Rellini, McCall, Randall, & Meston, 2005). In other words, it is possible that the lower concordance observed among women, compared to men, is due to the combination of many women with low or even negative correlations between genital and subjective responses with some women who have high correlations. In contrast, men may show less variability in subjective-genital agreement. This variability suggests individual differences can influence female concordance, and raises a host of fascinating questions as to the origins of low concordance among women.

In our meta-analysis, we restricted our analysis of the relationship between concordance and sexual functioning to those studies that included both sexually functional and dysfunctional participants. Although this resulted in a smaller number of studies, and therefore a smaller number of independent correlations for the analysis, the results reflect ideal conditions for making comparisons between sexually functional and dysfunctional persons who are exposed to identical or near-identical study procedures. The results showed no effect of sexual functioning on concordance for either men or women, but the absolute correlations were also notably lower than those obtained for men and women in the other studies included in this meta-analysis. For this reason, we carefully examined the studies included in the sexual functioning analyses for methodological factors that might account for these low concordance estimates.

For the female analysis, the majority of studies showed greater concordance in functional versus dysfunctional samples, but this pattern was obscured when average concordance was calculated across studies. Six of the ten studies that reported correlations for both groups reported higher correlations for sexually functional women (Brauer, Laan, & ter Kuile, 2006; Brauer, ter Kuile, Janssen, & Laan, 2007; Meston, 2006; Palace & Gorzalka, 1992; Payne et al., 2007; Wouda et al., 1998); three studies reported a negative correlation between subjective and genital arousal for sexually functional women (Brotto et al., 2004; Morokoff & Heiman, 1980; Salemink & van Lankveld, 2006) and one reported very similar concordance estimates for functional and dysfunctional women (Meston & McCall, 2005).

Three of the ten studies comparing sexually functional and dysfunctional women reported nonsignificant correlations for many study conditions, resulting in attenuation of the concordance estimate when average correlations were calculated across conditions (Morokoff & Heiman, 1980; Palace & Gorzalka, 1990, 1992). To illustrate, Palace and Gorzalka (1990) reported concordance estimates of .5 and .6 for sexually functional women in one stimulus condition, but then reported that the remaining correlations for both functional and dysfunctional women were not significant, and these were coded as correlations of zero according to our coding rules. A significant difference in concordance according to sexual functioning might have been obtained for women if we had the actual correlation coefficient values for all of the conditions.
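
The attenuation produced by this coding rule can be seen with simple arithmetic. In the sketch below, the two reported values (.5 and .6) come from the example above, whereas the number of nonsignificant conditions (four) and their "actual" values are hypothetical, as is the use of an unweighted mean rather than whatever averaging scheme was applied in the meta-analysis.

```python
from statistics import mean
from typing import Sequence


def mean_with_zero_coding(reported: Sequence[float], n_nonsignificant: int) -> float:
    """Average correlation when every nonsignificant condition is coded as 0."""
    return mean(list(reported) + [0.0] * n_nonsignificant)


if __name__ == "__main__":
    reported = [0.5, 0.6]                         # values quoted in the text
    hypothetical_actual = [0.15, 0.20, 0.10, 0.25]  # invented, small but nonzero

    coded = mean_with_zero_coding(reported, len(hypothetical_actual))
    uncoded = mean(reported + hypothetical_actual)
    print(f"mean with zero coding:        {coded:.3f}")
    print(f"mean with hypothetical values: {uncoded:.3f}")
```

Whenever the unreported correlations are positive but nonsignificant, coding them as zero pulls the average below what the full set of coefficients would give, which is why a real group difference could be masked.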

It is also notable that this analysis included a mix of sexual dysfunctions: four examined dyspareunia (Brauer et al., 2006, 2007; Payne et al., 2007; Wouda et al., 1998); three examined female sexual arousal disorder (Brotto et al., 2004; Meston & McCall, 2005; Morokoff & Heiman, 1980); and the remainder used mixed samples with sexual dysfunctions (Meston, 2006; Palace & Gorzalka, 1990, 1992; Salemink & van Lankveld, 2006). Ideally, analyses would be restricted to homogeneous dysfunction groups, because the relationship between concordance and sexual dysfunction may depend on the nature of the disorder. For example, several studies have reported lower concordance among women with female sexual arousal disorder (Morokoff & Heiman, 1980; Laan et al., 2008; Palace & Gorzalka, 1992) but significantly greater concordance among women with hypoactive sexual desire disorder, compared to functional women (Arnow et al., 2009).

For the male sexual functioning analysis, two factors may account for the lower concordance estimates found for both sexually functional and dysfunctional samples. First, three of the five studies used contiguous assessment of self-reported arousal (Abrahamson et al., 1985; Beck et al., 1983; Cranston-Cuebas, Barlow, Mitchell, & Athanasiou, 1993), which results in lower concordance among men. Second, samples of sexually dysfunctional men were, on average, older than the sexually functional men and, in our meta-analysis, age was positively related to concordance among men, which would have reduced the possibility of observing a difference between the two groups.

To date, no research focusing on sexual functioning has examined concordance between subjective and genital sexual arousal as a study outcome. Do women who report better sexual functioning also demonstrate higher concordance between psychological and physiological responses? Indirect evidence suggests this might be the case. Adams et al. (1985) reported significant concordance estimates for frequently orgasmic women. Similarly, Brody, Laan, and van Lunsen (2003) and Brody (2007) have reported that women who show greater concordance also report greater frequency of orgasm during penile-vaginal intercourse. Concordance may be a useful means of assessing integration of sexual information among women, and may prove to be a useful correlate of sexual functioning. We discuss the broader literature on integration of mind and body in the next section.

Integration of Mind–Body Awareness: Interoceptive Awareness and Sexual Functioning

Concordance between perception of genital response and actual genital sexual arousal is an index of interoceptive awareness, that is, the ability to accurately perceive physiological changes. Emotion theories, for example the James-Lange theory, implicate the perception of physiological cues in the appraisal and labelling of emotional states, such as anxiety (James, 1894; Lange, 1885). Research on the relationship between interoceptive awareness and other emotional states may, therefore, provide some insight into the nature of this form of psychophysiological awareness with respect to sexual arousal.

Certain anxiety disorders, such as panic disorder, have been associated with high levels of interoceptive awareness; for example, people with panic disorder show an enhanced awareness of cardiac cues in comparison to people without panic disorder (Ehlers & Breuer, 1992). Higher interoceptive awareness is associated with stronger heart rate responses to pleasant and unpleasant stimuli and with higher arousal ratings (Pollatos, Herbert, Matthias, & Schandry, 2007), as well as significantly higher electrical brain activity associated with emotional processing (P300 amplitudes; Pollatos, Kirsch, & Schandry, 2005). The association between arousal ratings and interoceptive awareness has been replicated in both high and low arousal states, as well as positive and negative emotional states (e.g., feeling nervous is high arousal with negative valence, while feeling content is low arousal with positive valence; Barrett, Quigley, Bliss-Moreau, & Aronson, 2004). These results suggest that persons with higher interoceptive awareness are more sensitive to cues of sympathetic nervous system arousal, a key autonomic component of sexual arousal (McKenna, 2002).

Gender differences in interoceptive awareness have been observed. Men show slightly greater interoceptive awareness using heart-rate detection tasks (Jones, 1995). There is also a gender difference in response to psychological stress, such that interoceptive awareness of heart rate decreases among women with increased stress, whereas men show no change (Fairclough & Goodwin, 2007). Pennebaker and Roberts (1992) have reported that men rely on interoceptive information to define their emotional state whereas women are more apt to attend to external, situational cues. It was speculated that gender-typical models of emotional processing may apply, such that women’s appraisals may be more cognition-dependent, whereas men’s appraisals are more consistent with James-Lange theory.

If Pennebaker and Roberts’s (1992) reasoning is correct, then men may have high sexual concordance because their subjective sexual arousal is highly influenced by their perception of the internal sensory cues that indicate the extent of their penile erection (e.g., fullness in the penis and groin, tightening of suspensory ligaments). Women, on the other hand, are more likely to be influenced by their attitudes, beliefs, and values regarding sexuality (Baumeister, 2000), as well as immediate contextual factors such as sexual stimulus properties and their appraisals of the sexual stimuli. These notions suggest that manipulating the internal or external information available to women and men could influence the degree of concordance that is observed. Increasing the number of contextual cues should increase female concordance, and reducing men’s awareness of their penile responding should reduce their concordance. The greater concordance we found for women in studies that included stimulus variation may be related to an increase in contextual information provided by varied stimulus content and modality.

In both men and women, a brain region implicated in interoceptive awareness (right insula; Critchley, Wiens, Rotshtein, Ohman, & Dolan, 2004) has also been shown to be active during sexual response to visual sexual stimulation (Karama et al., 2002; Park et al., 2001; Stoleru et al., 1999). A gender difference in insular activation during sexual arousal has also been reported, with men showing greater activation than women (Gizewski et al., 2006; Laan et al., 2006). Notably, insular activity is stronger during women’s ovulatory phase (Gizewski et al., 2006) and weaker in men with androgen insufficiency (Redouté et al., 2005), suggesting androgens play a role in activation of this brain region during sexual stimulation.

It is noteworthy that the dependent measure in much of the research on interoceptive awareness (accuracy in a heart-rate detection task) involves perception of a physiological cue that is identical for women and men, yet the pattern of results is similar to what we have obtained examining the relationship between subjective and genital sexual arousal using different psychophysiological measures. Collectively, these results suggest that the gender difference we have obtained in concordance is not limited to genital perceptions, and they provide a theoretical framework on which further research on integration of physiological and psychological components of sexual response might be based.

Implications for Future Research on Sexual Response

A gender difference in concordance has implications for the design and interpretation of future research on sexual response. First, sexual response research on women cannot exchange self-report or genital measures of sexual arousal, particularly when the latter is measured using photoplethysmography, because one may find very different associations depending on which aspect of sexual response is assessed. In men, however, these aspects of sexual arousal are sufficiently highly and positively correlated that assessing self-reported sexual arousal is informative if genital measures of sexual arousal are not available and there is no motivation to conceal sexual arousal.

Another implication of our findings has to do with clinical forensic assessments of men who have committed sexual offences or engaged in other problematic sexual behavior. Genital responding is informative about a man’s subjective experience of sexual arousal, and this is very helpful in situations where the man denies sexual interests in illegal targets or activities. Thus, phallometric testing is useful in the assessment of men who have sexually offended against children but deny any sexual attraction to children, or to assess men who have committed rape but deny any sexual interest in coercive sex (Lalumière, Harris, Quinsey, & Rice, 2005; Seto, 2008).

In contrast, a woman’s genital responding might reveal little about her sexual interests. If a woman showed a genital response to depictions of children, it might indicate that she was sexually interested in children, but it might also reflect the nonspecificity of female genital responding observed by Chivers and her colleagues with respect to gender and, to a lesser extent, species (Chivers et al., 2004, 2007; Chivers & Bailey, 2005; Suschinsky et al., 2009). Thus, genital assessments of women for forensic purposes—such as the assessment of female sex offenders—may not be clinically informative. Consistent with this possibility, Cooper, Swaminath, Baxter, and Poulin (1990) reported a case study on the psychophysiological assessment of a female sex offender with child victims. This woman did not genitally discriminate between sexual stimuli depicting children or adults, or between depictions of coercive versus consensual sex.

Finally, the results of this meta-analysis have implications for our understanding of sexual functioning. The gender difference in concordance may be a manifestation of a broader gender difference in interoceptive awareness. The relationships observed among gender, interoception, and use of internal and external cues in the emotions literature may be very helpful for the development of gender-specific models of sexual functioning. Research on male sexual functioning suggests that sexually functional men may possess greater interoceptive awareness than men with erectile problems (Cranston-Cuebas et al., 1993). Awareness of penile erection may facilitate further physiological arousal through a positive feedback process for functional men; in contrast, men with erectile disorder are less aware of their erectile responses, and positive feedback is not activated.

Nobre et al. (2004), however, reported that among sexually functional men, interoceptive awareness does not predict accuracy in estimating erectile response, though variation in interoceptive awareness in this group may have been too limited for an effect to emerge. Similarly, we found no differences in male concordance relating to sexual functioning in our meta-analysis. If this effect is reliable, the role of perception of penile response in the development of sexual dysfunction may need to be reconsidered.

It is unclear whether a relationship between interoceptive awareness and female sexual functioning would be found; the results reported by Adams et al. (1985) and Brody (2007) suggest this might be a fruitful line of research to pursue. As a group, men may be more likely to rely on physiological cues when formulating an appraisal of their sexual arousal, whereas women may demonstrate greater variability in this tendency, resulting in more variable sexual functioning.

A host of factors that were not explored in this meta-analysis, such as individual differences in sympathetic tone and cognitive schemas relating to mind–body integration, may impact upon interoceptive awareness. For example, negative body image, a factor implicated in women’s sexual functioning (Nobre & Pinto-Gouveia, 2006), is associated with lower nonsexual interoceptive awareness in women (Tylka & Hill, 2004). Further research examining interoceptive awareness may also prove fruitful in identifying relevant factors and viable therapy targets for improving sexual functioning in women and in men.

Limitations

A common criticism of sex research is that participants are not randomly sampled from the population, thereby limiting the generalizability of findings (for a review, see Brecher & Brecher, 1986). Compared to nonvolunteers, volunteers for sex research tend to be more sexually experienced, have more liberal sexual attitudes, and are more interested in sexually explicit materials (e.g., Morokoff, 1986; Saunders, Fisher, Hewitt, & Clayton, 1985; Wolchik, Braver, & Jensen, 1985). Brecher and Brecher (1986) argued, however, that valid conclusions can still be made from sex research through the use of matched comparison groups, cumulative findings from samples that are selected to be as diverse as possible, exclusion of confounding variables, and minimization of volunteer bias. This meta-analysis presents a quantitative synthesis of a large and diverse set of studies reporting data on subjective-genital agreement.

Concordance was measured using a correlation coefficient in this meta-analysis because this estimate of concordance is overwhelmingly reported in the literature. A correlation captures agreement in the direction of self-reported and genital sexual arousal. This means that, when concordance is high, change in subjective response is mirrored by change in genital response. However, a correlation does not capture the magnitude of changes in self-reported and genital arousal; thus, large changes in subjective response mirrored by small changes in genital response would still yield a large positive correlation. Sakheim et al. (1985) explored the distinction between direction and magnitude (what they described as intensity) of sexual responses and suggested using agreement ratios. In their paper, male sexual arousal showed directional agreement, such that increases in sexual responses resulted in greater agreement, and partially supported intensity agreement if men were able to see their erections. Interestingly, stronger erections resulted in lower intensity agreement, reflecting the fact that men often achieved full erection before they reached maximum subjective sexual arousal. This method of examining concordance has not been used in studies of women and may provide more insight into the nature of sexual response agreement (see Footnote 6).
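
The point that a correlation reflects direction rather than magnitude can be shown with a short sketch. The data below are hypothetical values in arbitrary units, invented for illustration and not drawn from any study; `pearson_r` and `response_range` are helper names introduced here. This is not the agreement-ratio method of Sakheim et al. (1985), only a demonstration that the Pearson correlation is unchanged when one series varies far less than the other.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def response_range(values):
    """Size of the change in a response series (maximum minus minimum)."""
    return max(values) - min(values)


# Hypothetical responses across five stimuli, in arbitrary units.
subjective = [10, 30, 50, 70, 90]  # large changes in self-reported arousal
genital = [10, 12, 14, 16, 18]     # small changes in genital response

r = pearson_r(subjective, genital)
print(f"r = {r:.2f}")
print(f"subjective range = {response_range(subjective)}, "
      f"genital range = {response_range(genital)}")
```

In this constructed case the two series move in the same direction at every step, so the correlation is at its maximum even though the subjective change is ten times the genital change.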

Another limitation of using correlation coefficients is that one cannot simultaneously examine subjective-genital agreement within individuals as well as within groups. In addition, sexual response data may violate the independence assumptions of correlation, linear regression, and repeated measures analysis of variance techniques. Hierarchical linear modelling, on the other hand, allows researchers to examine the agreement between contiguously assessed genital and subjective sexual arousal, and to use the coefficients that model this relationship to compare groups and to examine the potential effects of individual differences as moderators of these relationships (Rellini & Meston, 2006; Rellini et al., 2005). A disadvantage of hierarchical linear modelling is that the coefficients it produces (slope and intercept) are not readily interpretable, unlike correlation coefficients. This suggests that hierarchical linear modelling and correlational analyses provide complementary information about subjective-genital agreement.
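
To make the slope and intercept coefficients mentioned above more concrete, the following sketch fits a separate regression line for each participant and then averages the slopes within each group. This two-stage approach is a simplified stand-in for hierarchical linear modelling, not the procedure used by Rellini and colleagues; a full multilevel model would estimate all participants jointly. The data, the group labels, and the choice of genital response as the predictor are assumptions made for illustration.

```python
from statistics import mean


def ols_line(x, y):
    """Ordinary least squares fit of y on x; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def mean_slope(participants):
    """Average within-person slope across (genital, subjective) series."""
    return mean(ols_line(genital, subjective)[0]
                for genital, subjective in participants)


# Hypothetical contiguous measurements: one (genital, subjective) pair of
# series per participant, sampled at four time points.
group_a = [
    ([0, 1, 2, 3], [0, 2, 4, 6]),
    ([0, 1, 2, 3], [1, 2, 3, 4]),
]
group_b = [
    ([0, 1, 2, 3], [2, 2, 2, 2]),
    ([0, 1, 2, 3], [0, 1, 1, 2]),
]

print(f"Group A mean within-person slope: {mean_slope(group_a):.2f}")
print(f"Group B mean within-person slope: {mean_slope(group_b):.2f}")
```

Each slope expresses how much subjective arousal changes per unit change in genital response within one person, which is why its value depends on the measurement units and is harder to interpret than a correlation bounded between -1 and 1.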

As is the case for all sexual psychophysiology research to date, the results and conclusions we draw are based on data from Western industrialized populations; the gender difference in concordance may be limited to women and men in Canada, the United States, Australia, and northwestern Europe. It remains an empirical question whether sociocultural factors moderate the gender difference in concordance and, if so, in what direction. Another limitation to the generalizability of our findings is that the large majority of studies of men used circumferential penile gauges and most of the studies of women used VPA as measures of genital response. Our results suggest that thermography, for example, may produce higher estimates of concordance than VPA, and thus different methods of genital arousal assessment may produce different concordance estimates.

The ecological validity of laboratory research on concordance must also be considered. Sexual psychophysiology is conducted in a laboratory environment where sexual arousal is induced using various types of sexual stimuli, a situation that is very different from the usually private experience of an actual sexual encounter. The response patterns observed in the laboratory may not necessarily reflect those outside the laboratory (Rowland, 1999). Women’s sexual response may, for example, be differentially affected by laboratory procedures, resulting in the observed gender differences. With the development of ambulatory psychophysiological equipment, more naturalistic assessments of women’s sexual concordance will be possible: In men, for example, genital responses measured in the laboratory are positively correlated with those measured in the natural environment using a portable penile plethysmograph (Rea, DeBriere, Butler, & Saunders, 1998).

A final limitation concerns our analytical strategy. To conduct the moderator analyses, we chose restrictive inclusion criteria in an effort to reduce other sources of variation. In some cases, this resulted in sample sizes that were too small for powerful statistical comparisons. The less reliable results that we obtained using smaller sets of studies should therefore be interpreted as directions for future investigations of factors influencing concordance, rather than conclusive evidence regarding the moderators of concordance in women and men. In addition, our analyses were univariate in nature. Although we did not have explicit hypotheses regarding moderator interactions, such analyses would have been helpful to determine the combination of methodological parameters that maximize concordance; for example, what happens when women are exposed to varied stimulus content, visual stimuli, and their genital response is assessed using thermography? Although we believe that such analyses are better suited to individual experimental studies, the accumulation of studies on concordance will eventually allow multivariate meta-analyses.

Final Comment

We have focused on explanations for low female concordance in our discussion of these results, but one might also wonder why male concordance is so high. From this perspective, the typically low concordance observed among women is the norm, and the typically high subjective-genital agreement exhibited by men needs to be explained. Research on interoception and emotion suggests that awareness of internal sensations and access to an external peripheral cue, such as awareness of a penis in different states of erection, can increase the agreement of psychological and physiological responses. If this explanation is correct, and male concordance is a by-product of being able to see and feel changes in penile tumescence, then experimental research that restricts this feedback (e.g., by placing a barrier that prevents the participant from seeing his penis or lying in a position that reduces tactile feedback from an erection) should decrease concordance. The few studies that have implemented such techniques, however, continue to report similar accuracy estimates of erection, regardless of body position (Schaefer et al., 1976) or access to visual feedback (Sakheim et al., 1985).

Another possible explanation for the high concordance observed among men is that both psychological and genital sexual arousal are necessary for men to engage in sexual intercourse: Subjective feelings of sexual arousal motivate sexual behavior, while penile erection is necessary for penetration. From an adaptationist perspective, high concordance might have been selected for among our male ancestors, such that men with high concordance were more likely to achieve intromission and reproduce than men who had low concordance and felt sexually aroused without an accompanying erection, or developed erections without the accompanying subjective sexual arousal to motivate them to seek sexual intercourse.

For women, however, concordance is not necessary to engage in sexual intercourse. In fact, the more conservative sexual strategy (in terms of greater choosiness regarding sexual partners, having fewer sexual partners and longer-term relationships) adopted by many women might be compromised by high concordance (see Symons, 1979). From this perspective, partial independence of psychological and genital processes may aid female sexual decision-making by reducing arousal-dependent appraisal of suitable mates (for an elaboration of this idea, see Laan, Everaerd, van der Velde et al., 1995; Suschinsky et al., 2009). For women’s sexual pleasure, however, sexual concordance may indeed be very important. Future research could test these ideas by examining the relations among subjective-genital agreement and individual differences in sexual history, sexual attitudes, sexual responsiveness, and sexual functioning.

Footnotes
1

By preferred, we mean sexual stimuli that correspond to the participant’s self-reported sexual interests; thus, the preferred stimulus for heterosexual men and homosexual women would depict women, while the preferred stimulus for heterosexual women and homosexual men would depict men.

 
2

We do not exhaustively list all the measurement methods available, focusing instead on those methods that were most commonly used in the studies we included in this meta-analysis.

 
3

In phallometry, penile responses can be recorded as mm change in penile circumference or in cc change in penile volume, both of which are physiologically and behaviourally meaningful units of response.

 
4

Given the number of moderators we examined in this meta-analysis, we would expect one of these two significant findings to be due to chance. We did not correct for number of comparisons in selecting our statistical significance level because our meta-analysis was designed to be exploratory in terms of examining potential influences on subjective-genital agreement. The fact that only two moderators were identified out of the many examined is consistent with our conclusion that the gender difference in concordance is real and robust.

 
5

One could argue that including nonpreferred sexual stimuli also artificially increases subjective-genital agreement, at least for men, as their responses to nonpreferred stimuli might be no different from their response to neutral stimuli (e.g., stimuli depicting men only for heterosexual men). This is not the case for women, however, as research by Chivers and others has shown that women do genitally respond to sexual stimuli they do not prefer and that they do not find subjectively arousing.

 
6

This may be difficult using VPA or related forms of genital response measurement in women, however, as we do not know what constitutes a maximum genital response in women, or whether there is a maximum genital response in women equivalent to a full erection in men. Measuring VPA at orgasm as a means of quantifying ‘maximum genital arousal’, for example, is not possible because pelvic floor contractions during orgasm create artifacts in the signal, distorting the vasocongestive response. Moreover, VPA is measured on an ordinal scale, whereas penile circumference or volume change is measured on a ratio scale that allows for the meaningful calculation of agreement ratios.

 

Acknowledgements

Sincerest thanks to Amy K. Bach, Stephanie Both, Marieke Brauer, Lori A. Brotto, John E. Desmond, Ann N. Elliot, Michael Exton, George A. Gaither, Cynthia A. Graham, Anita Islam, Erick Janssen, Tuuli M. Kukkonen, Elizabeth J. Letourneau, Katie M. McCall, Cindy M. Meston, Kimberly A. Payne, Nicole Prause, David L. Rowland, Rebecca L. Schacht, Moniek M. ter Kuile, Jacques J. D. M. van Lankveld, Risa B. Weisberg, and Jan C. Wouda for providing additional data for this meta-analysis. Preparation of this work was supported by postdoctoral fellowships from the Canadian Institutes of Health Research, the Social Sciences and Humanities Research Council of Canada, and the Ontario Council on Graduate Studies/Ontario Women’s Health Council awarded to Meredith L. Chivers. Parts of this article were presented at the 2009 meeting of the Society for Sex Therapy and Research, Arlington, VA, the 2008 meeting of the Canadian Sex Research Forum, Montréal, Canada, and the 2005 meeting of the International Academy of Sex Research, Ottawa, ON, Canada.

Open Access

This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

Copyright information

© The Author(s) 2009