Archives of Sexual Behavior, Volume 35, Issue 6, pp 667–684

Gender Development in Women with Congenital Adrenal Hyperplasia as a Function of Disorder Severity

Authors

  • H.F.L. Meyer-Bahlburg
    • NYS Psychiatric Institute/Department of Psychiatry, Columbia University
  • Curtis Dolezal
    • NYS Psychiatric Institute/Department of Psychiatry, Columbia University
  • Susan W. Baker
    • Department of Pediatrics, Mount Sinai School of Medicine
  • Anke A. Ehrhardt
    • NYS Psychiatric Institute/Department of Psychiatry, Columbia University
  • Maria I. New
    • Department of Pediatrics, Mount Sinai School of Medicine

Original Paper

DOI: 10.1007/s10508-006-9068-9

Cite this article as:
Meyer-Bahlburg, H.F.L., Dolezal, C., Baker, S.W. et al. Arch Sex Behav (2006) 35: 667. doi:10.1007/s10508-006-9068-9

Abstract

Prenatal-onset classical congenital adrenal hyperplasia (CAH) in 46,XX individuals is associated with variable masculinization/defeminization of the genitalia and of behavior, presumably both due to excess prenatal androgen production. The purpose of the current study was threefold: (1) to extend the gender-behavioral investigation to the mildest subtype of 46,XX CAH, the non-classical (NC) variant, (2) to replicate previous findings on moderate and severe variants of 46,XX CAH using a battery of diversely constructed assessment instruments, and (3) to evaluate the utility of the chosen assessment instruments for this area of work. We studied 63 women with classical CAH (42 with the salt wasting [SW] and 21 with the simple virilizing [SV] variant), 82 women with the NC variant, and 24 related non-CAH sisters and female cousins as controls (COS). NC women showed a few signs of gender shifts in the expected direction, SV women were intermediate, and SW women were most severely affected. In terms of gender identity, two SW women were gender-dysphoric, and a third had changed to male in adulthood. All others identified as women. We conclude that behavioral masculinization/defeminization is pronounced in SW-CAH women, slight but still clearly demonstrable in SV women, and probable, but still in need of replication, in NC women. There continues to be a need for improved instruments for gender assessment.

Keywords

Congenital adrenal hyperplasia; Gender-related behavior; Gender identity; Gender dysphoria; Androgen effects; Disorders of sex development; Hermaphroditism

Introduction

Congenital adrenal hyperplasia (CAH) in 46,XX individuals denotes a family of syndromes of disorders of sex differentiation (DSD) associated with excessive prenatal and postnatal androgen production due to defective genes that regulate the production of enzymes involved in adrenal steroidogenesis (Grumbach, Hughes, & Conte, 2003; New, 2003). In about 90% of the cases, the affected enzyme is 21-hydroxylase (21-OH). 21-OH deficiency impairs the synthesis of cortisol (a glucocorticoid) and aldosterone (a mineralocorticoid), and their unused precursor hormones enhance the production of adrenal androgens.

Three major subtypes of clinical severity of 21-OH deficiency CAH are distinguished, which depend on the specific type of the underlying molecular gene defect. Most severe is the salt-wasting (SW) variant, in which the synthesis of both cortisol and aldosterone is fully blocked or markedly inhibited, with potentially lethal consequences, if untreated, due to glucose and sodium dysregulation (“salt-wasting” condition). The simple virilizing (SV) subtype represents an intermediate degree, in which the synthesis of cortisol is more markedly affected than the synthesis of aldosterone. In both the SW and SV subtypes, the prenatal androgen excess causes variable prenatal masculinization of the genitalia (genital ambiguity), on average more markedly so in the SW subtype. This is followed by variable postnatal symptoms of continuing virilization such as hirsutism, acne, and menstrual irregularities, if the hormone deficiency is left untreated. Because the two forms with genital ambiguity at birth were the first to be recognized, they are called the “classical” (prenatal-onset) variants of CAH. The “non-classical” (NC) (late-onset) subtype is the mildest, without genital ambiguity at birth, but possible onset of clinical symptoms of virilization in childhood or later.

Since the benchmark publications of Ehrhardt and Money and their co-workers in 1968 (Ehrhardt, Epstein, & Money, 1968; Ehrhardt, Evers, & Money, 1968) documented shifts from female-typical toward male-typical behavior in groups of 46,XX individuals with classical CAH, a number of studies have replicated these findings, although there is considerable interindividual variation ranging from female-typical to highly atypical behavior (earlier findings cited in Meyer-Bahlburg, 2001; recent studies: Hall et al., 2004; Hines, Brook, & Conway, 2004; Long, Wisniewski, & Migeon, 2004; Nordenström, Servin, Bohlin, Larsson, & Wedell, 2002; Wisniewski, Migeon, Malouf, & Gearhart, 2004). The cause of the behavioral masculinization/defeminization is also widely assumed to be the excessive prenatal androgen production. The argument is derived from experimentation on mammals ranging from rodents to non-human primates that has demonstrated causal contributions of pre- and/or perinatal androgens to the sexual differentiation of genitals, brain, and subsequent behavior (Arnold, 2002; DeVries & Simerly, 2002; Goy & McEwen, 1980; Wallen, 2005). The behavioral effects are broad and involve most behavioral domains known to show sex differences (e.g., juvenile play behavior, adult sexual behavior, parenting behavior, aggressive behavior, sex partner preference, and spatial perception), although the effect sizes vary considerably. Behavioral differences in the same domains have been shown between girls and women with CAH and control females (Hines, 2004; Meyer-Bahlburg, 2001).

In the present study, we use the short term “behavioral masculinization/defeminization” for the shift from female-typical towards male-typical behavior to indicate that both suppression of female-typical and development of male-typical behaviors are likely to be involved (Collaer & Hines, 1995; Goy & McEwen, 1980), possibly via different mechanisms that are as yet insufficiently specified for humans. Originally, these terms were coined for the effects of experimental prenatal and perinatal hormone manipulations on specific and relatively stereotypic aspects of sexual behavior in non-human mammals (Hines, 2002; Wallen & Baum, 2002), but they have since been expanded to other behavioral domains. There is as yet no consensus as to which specific gender-behavior changes in 46,XX humans indicate defeminization and which masculinization.

To support the hormone-behavior relationship in humans, however, we have to rely on indirect correlational evidence from non-experimental follow-up studies, as direct hormonal experimentation with normal human pregnancies and healthy infants is ethically not justifiable because of the health risks involved. A number of studies have shown that androgen levels in 46,XX fetuses or newborns with CAH are excessive and can be higher than those in normal comparison males (Carson et al., 1982; Pang, Levine, Chow, Faiman, & New, 1979; Pang et al., 1980, 1985; Wudy, Dörr, Solleder, Djalali, & Homoki, 1999), but such studies are typically limited to one time point (amniocentesis, newborn period), do not directly measure androgen levels in the fetus, but instead make inferences from androgen levels in amniotic fluid or the umbilical cord, and do not cover the entire prenatal and/or neonatal period. In addition, no one has studied later gender-differentiated behavior in these children as a function of the indirectly measured pre/perinatal hormone levels. Several investigators have attempted to relate indirectly assessed pre/perinatal androgen levels (in some studies even based on the blood levels of the pregnant mother) in female non-CAH fetuses or newborns and occasionally found associations with selected gender-related cognitive abilities or behaviors later (for references, see Baron-Cohen, Knickmeyer, & Belmonte, 2005; Hines, 2004; see also van de Beek, 2005; van de Beek, Thijssen, Cohen-Kettenis, van Goozen, & Buitelaar, 2004), but the correlations found to date are at best modest and inconsistent. This is not surprising given that the androgen levels seen in those studies are typically well below pre/perinatal androgen levels found in CAH.

The argument supporting a causal role of androgens in the development of behavioral masculinization/defeminization in girls and women with classical CAH has been strengthened by the demonstration of group differences between mild and more severe forms of CAH (because the more severe forms have more androgen excess): between SV and SW (Berenbaum, Duck, & Bryk, 2000; Dittmann et al., 1990; Hall et al., 2004; Long et al., 2004; Wisniewski et al., 2004; Zucker et al., 1996), and between SV and two degrees of SW in children (Berenbaum et al., 2000; Hall et al., 2004; Nordenström et al., 2002), with the classification based either on endocrine or molecular-genetic criteria. The argument that the hormonal effects take place primarily during fetal development, as in the rhesus monkey, which is the most frequently studied non-human primate, was already made by Ehrhardt et al. (1968) on the basis of the comparison of early (postnatally) treated and late (i.e., often not before adulthood) treated CAH females. Berenbaum et al. (2000) garnered further support for the prenatal hypothesis by showing in CAH girls that somatic indicators of postnatal androgen effects were not associated with gendered behavior, while (inferred) prenatal androgen excess was related.

The present study had three different purposes. First, we wanted to extend the investigation of gender-related behavior of 46,XX CAH to the as yet unstudied NC subtype, which was expected to show less, if any, behavioral masculinization/defeminization than the two classical variants of CAH. Secondly, we wanted to replicate the previous findings of SW-SV differences with an assessment battery that included diversely constructed detailed instruments for the assessment of gender-related behavior in adulthood and, retrospectively, in childhood, preferably developed for populations other than persons with DSD, so that we could demonstrate the robustness of CAH findings across interviews and written self-report forms of different types. Thirdly, we also wanted to evaluate the utility of such assessment instruments for differentiating between DSD subtypes of different degrees of severity, given that this literature is less well developed than the child literature (Zucker, 2005). We were particularly interested in comprehensive instruments that sampled diverse gender-related behaviors in childhood (retrospectively) and/or adulthood and provided also relatively global summary scales of behavioral masculinization/defeminization, which are very desirable for clinical characterization of persons with DSD.

Method

Participants

During an initial pilot phase of this project, a small number of women with CAH was recruited from two pediatric endocrine clinics in New York City (Meyer-Bahlburg et al., 2000, 2003). Subsequent data collection for the main study was limited to the senior endocrine author's (M.I.N.) clinic, but data from both study phases were combined for the final analysis. Eligible were all adult women with CAH due to 21-hydroxylase deficiency for whom the molecular genetics of the 21-hydroxylase gene had been determined and who spoke English. Some of the CAH women had been in touch with the clinic only for diagnostic purposes, whereas others had received long-term care and follow-up. The recruitment began with an initial contact letter from the clinic director or the clinician with whom the woman was most familiar that described the study procedures and invited the woman's participation. The letter included a brief Answer Sheet covering some demographic and contact information and an indication of willingness to participate, refusal to participate, or request for additional information. When no response was received within two weeks, attempts were made to contact the woman by phone. When the address was invalid because of change of residence, much effort was spent on identifying the new correct address.

Geographically, the women were spread over the entire United States and other continents. Transportation reimbursement was provided for women within the continental U.S. Willingness to travel to New York City was severely impacted by the World Trade Center attack of September 2001.

Table 1. Recruitment breakdown of CAH women from the primary source clinic by CAH subtype

| Category | SW [n (%)] | SV [n (%)] | NC [n (%)] |
|---|---|---|---|
| Eligible by record | 94 (100) | 58 (100) | 230 (100) |
| Miscellaneous reasons for exclusionᵃ | 4 (4) | 5 (9) | 6 (3) |
| Moved and not traceable | 15 (16) | 14 (24) | 47 (20) |
| Refused participationᵇ | 14 (15) | 9 (16) | 51 (22) |
| Agreed to participate but unable to do so during data-collection periodᵇ | 25 (27) | 9 (16) | 44 (19) |
| Participated | 36 (38)ᶜ | 21 (36) | 82 (36) |

ᵃ Including among others n=2 deaths and n=11 living outside of North America.

ᵇ Stated reasons included fear of travel following the World Trade Center attack, work pressures, childcare needs, etc.

ᶜ n=2 were not included in the SW group for statistical data analysis: one patient because he had changed to living as a man and was therefore not administered questionnaires and interviews designed for women; the other because of cognitive limitations which interfered with the standard administration of assessment instruments. The final analysis sample of n=40 also included 6 SW women from the second source clinic, which was only involved in the pilot phase.

Table 1 provides an overview of the recruitment outcome of CAH women from the primary source clinic by CAH subtype. Between 16 and 24% of women could not be found; between 15 and 22% clearly refused participation; another 16 to 27% agreed to participate but were unable to do so during the period of data collection. Thus, only 36 to 38% of the eligible women from the primary source clinic entered the study. The total study sample for CAH women included 42 SW women, 21 SV women, and 82 NC women. Non-CAH control women (designated COS) consisted of sisters and female cousins of participating CAH women. Participating CAH women were asked to inform their sisters or female cousins about the study and/or to give us permission to contact them for study invitation (or to have them contact us). A total of 24 such control women could be enrolled in the study during the period of data collection. Study procedures were approved by the appropriate institutional review boards, and all participants gave written informed consent.
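
The percentage ranges for each recruitment outcome can be recomputed directly from the counts in Table 1. The following is a minimal sketch, assuming whole-number rounding as used in the table; the names `ELIGIBLE`, `OUTCOMES`, `percent`, and `percent_range` are illustrative and not part of the study materials.

```python
# Counts transcribed from Table 1 (primary source clinic only).
ELIGIBLE = {"SW": 94, "SV": 58, "NC": 230}
OUTCOMES = {
    "Miscellaneous exclusion": {"SW": 4, "SV": 5, "NC": 6},
    "Moved and not traceable": {"SW": 15, "SV": 14, "NC": 47},
    "Refused participation": {"SW": 14, "SV": 9, "NC": 51},
    "Agreed but unable to participate": {"SW": 25, "SV": 9, "NC": 44},
    "Participated": {"SW": 36, "SV": 21, "NC": 82},
}


def percent(count, total):
    """Percentage rounded to the nearest whole number, as in Table 1."""
    return round(100 * count / total)


def percent_range(category):
    """(min, max) percentage for one outcome category across the three subtypes."""
    values = [percent(n, ELIGIBLE[s]) for s, n in OUTCOMES[category].items()]
    return min(values), max(values)


# Every eligible woman falls into exactly one outcome category.
for subtype, total in ELIGIBLE.items():
    assert sum(row[subtype] for row in OUTCOMES.values()) == total

for category in OUTCOMES:
    low, high = percent_range(category)
    print(f"{category}: {low}-{high}%")
```

The internal check confirms that the five outcome rows sum to the number eligible by record within each subtype.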

Measures

All women underwent an 8–10-hr protocol (often spread out over several days) of standard self-report questionnaires, psychometric tests, physical examinations, and systematic interviews, the latter involving both quantitative, semi-structured, blindly administered interview schedules, and a final qualitative, open-ended, non-blindly administered interview. Interviewers were clinical psychologists specifically trained in these procedures. Written gender questionnaires were administered earlier than gender interviews. Interviews were audiotaped for quality monitoring by the interviewer trainer associated with this study and, in the case of qualitative interviews, for content analysis. For the assessment of gender-related variables, we selected a battery of measures that covered a wide range of behaviors and preferences in childhood, adolescence, and adulthood, represented diverse item formats, and showed strong sex differences. However, psychometric data are based on various non-clinical and clinical convenience samples, and none of these instruments are normed on nationally representative samples. The current report, which focuses on the gender-related variables, is based on the quantitative interviews and questionnaires listed below. (The study also included a detailed assessment of sexual orientation, which will be addressed in a separate publication.)

For some of the instruments, we had data available from a preceding study of ours on the long-term aftereffects of prenatal diethylstilbestrol (DES) exposure (Lish, Meyer-Bahlburg, Ehrhardt, Travis, & Veridiano, 1992). We used the 67 DES-unexposed control women and 60 DES-unexposed control men from that study to re-demonstrate, where applicable, the discriminant validity of the respective scales in terms of capturing sex differences (see Results below).

Gender-related variables recalled from childhood

This category refers to retrospectively assessed childhood behaviors and preferences in which the sexes differ, such as rough-and-tumble play, preference for male or female playmates, interest in infants, sports orientation, vocational interests, etc.

Childhood Play Activities Questionnaire (CPAQ)

The CPAQ (Grellert, Newcomb, & Bentler, 1982) is a written self-report questionnaire including two play activity check lists, one covering ages 5 to 8 years, and another ages 9 to 13 years, selected for activities liked best by boys or best by girls. (Example: playing house: no/yes.) Scales derived by monotonicity analysis showed moderate internal consistency and discriminant validity. A Masculinity Scale (M SUM) was formed by summing the four masculine-type scales, and a Femininity Scale (F SUM) similarly by summing the four feminine-type scales. Kuder-Richardson reliability coefficients for all scales (both genders combined) ranged from .58 to .84, median .70. The scales discriminated significantly between heterosexual males and females. In addition, the CPAQ includes a set of 13 play-related items that assess attitudes, behavior, and parent-child interaction (Example: “avoided physical fights”: 1 Always/2 Often/3 Sometimes/4 Never). Of these 13 items, only the 5 items with significant gender differences are included in the current report.
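
For readers unfamiliar with the Kuder-Richardson coefficient reported for the CPAQ checklists, the following is a minimal sketch of formula 20 (KR-20) for dichotomous no/yes items. The function name `kr20` and the response matrix are hypothetical and serve only to show the calculation; they are not CPAQ data, and the original authors' exact computational choices (e.g., the variance denominator) are not stated in the text.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson formula 20 for dichotomous (0/1) items.

    responses: list of respondents, each a list of 0/1 item scores.
    Uses the population variance of the total scores (an assumption).
    """
    n_items = len(responses[0])
    n_resp = len(responses)
    # Proportion endorsing each item (p); the item variance is p * (1 - p).
    p = [sum(r[i] for r in responses) / n_resp for i in range(n_items)]
    item_var_sum = sum(pi * (1 - pi) for pi in p)
    total_var = pvariance([sum(r) for r in responses])
    return (n_items / (n_items - 1)) * (1 - item_var_sum / total_var)


# Hypothetical answers of five respondents to a four-item checklist (1 = yes).
example = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(example), 2))
```

KR-20 is the special case of Cronbach's α for items scored 0/1, which is why the two coefficients are reported interchangeably across the instruments in this battery.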

Recalled Childhood Gender Questionnaire-Revised (RCGQ-R)

The RCGQ-R (Meyer-Bahlburg et al., in press) is a recently revised version of the written self-report form, Recalled Childhood Gender Questionnaire (RCGQ; Mitchell & Zucker, 1991; Zucker et al., in press). The RCGQ-R focuses particularly on gender-related behaviors that are part of the clinical picture of gender identity disorder of childhood. The RCGQ-R yields three scales labeled Gender Role, Physical Activity, and Cross-Gender Desire. (Example: “10. In fantasy or pretend play, I took the role: a. Almost always of boys or men/b. Usually of boys or men more than girls or women/c. Equally boys/men and girls/women/d. Usually of girls or women more than boys or men/e. Almost always of girls or women/f. I did not do this type of pretend play.”) The first-factor gender scale of the older version (Zucker et al., in press) showed an internal consistency (Cronbach's α) of .92 and a highly significant sex difference (Cohen's d=1.74).

Masculine Gender Identity Scale in Women (MGI-F)

The MGI-F (Blanchard & Freund, 1983) is a written self-report questionnaire to measure masculine gender identity, including male sex-typed behaviors, among women. Part A (20 items) was intended to assess masculine gender identity, in 16 items for childhood as expressed in play and peer preferences, fantasies, and the wish to have been born a male (Example: 4. Between the ages 6 and 12, did you like to do jobs or chores which are usually done by men: 0 No/1 don't remember/2 Yes), in two items for adolescence, and in another two items for adulthood. Part A had an internal consistency (Cronbach's α) of .89 and discriminated well (p≤.001) between heterosexuals, homosexuals, and transsexuals.

Gender Role Assessment Schedule-Adult (GRAS-A)

The GRAS-A (Ehrhardt & Meyer-Bahlburg, 1984) is a semi-structured interview with multiple probes, which requires the interviewer to rate the participants' responses on preformulated rating scales of variable length. This schedule includes both retrospective childhood items and items covering adulthood behavior and preferences. The interview was originally designed for single-item analysis, but secondarily derived multi-item scales have been used in a number of studies (e.g., Meyer-Bahlburg, Feldman, Ehrhardt, & Cohen, 1984; Meyer-Bahlburg et al., 1996). As response-scale options differed between items, all were recoded into a standard 1–5 format prior to scale calculations. To cover recalled childhood gender-related behavior in the current project, we utilized two versions of a childhood summary scale. One was a 6-item scale, ChildSum1, applicable to both genders, and involving items 01 = past toy play (1 Boys' toys and activities exclusively / … / 5 Girls' toys and activities exclusively); 02 = past parenting play (0 Never or rarely / … / 3 Very frequently); 03 = past rough-and-tumble play (0 Never or rarely / … / 3 Very frequently); 06 = past friends' sex (1 Spent all time with boys / … / 5 Spent all time with girls); 12 = past physical activity level (1 Extremely physically active / … / 5 Generally physically inactive); and 20 = a composite of items of pretend and acting roles (1 Male exclusively / … / 5 Female exclusively). Cronbach's α=.85 for an adult sample involving male and female controls from the prenatal DES project. The other summary scale was a female-only 8-item scale, ChildSum2, which combines the 6-item scale above with two female-specific items, namely, 64 = labeled tomboy by others (0 Never/1 Maybe/2 Definite); 65 = labeled tomboy by self (0 Never/1 Maybe/2 Definite). Cronbach's α=.72 for women controls from the DES project, lower than for ChildSum1 because of the reduced variability within a female-only control sample. 
Sex differences are reported in the Results section below.
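
The recoding of GRAS-A items into a common 1–5 format and the internal-consistency coefficient can be sketched as follows. The text does not specify how the recoding was done; the linear rescaling and the optional reverse-keying below are assumptions made for illustration, as are the function names `rescale_to_1_5` and `cronbach_alpha` and the example data.

```python
from statistics import variance


def rescale_to_1_5(value, low, high, reverse=False):
    """Linearly map a rating from [low, high] onto [1, 5].

    reverse=True flips the direction of the item (assumed to be needed
    when an item's high end points the opposite way from the scale).
    """
    scaled = 1 + 4 * (value - low) / (high - low)
    return 6 - scaled if reverse else scaled


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# A 0-3 frequency rating ("Never or rarely" ... "Very frequently") on the 1-5 format.
print(rescale_to_1_5(0, 0, 3), rescale_to_1_5(3, 0, 3))
# The same rating, reverse-keyed.
print(rescale_to_1_5(3, 0, 3, reverse=True))

# Hypothetical recoded scores of four respondents on two items.
print(cronbach_alpha([[1, 2], [2, 3], [3, 4], [4, 5]]))
```

Placing all items on one metric before summation keeps items with longer response scales from dominating the summary score.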

Gender-related variables from adulthood

Gender Role Assessment Schedule-Adult (GRAS-A)

From the GRAS-A (Ehrhardt & Meyer-Bahlburg, 1984) described above, we utilized two versions of an adulthood-items summary scale to represent gender-related behavior of adulthood. One was AdultSum1, a 9-item scale that can be used with both genders, including the items 04 = gender of friends (1 All friends are men / … / 5 All friends are women); 07 = physical activity/leisure, current (1 Extremely physically active / … / 5 Generally physically inactive); 30 = wanting or already having children (1 No/2 Uncertain/3 Yes); 31 = satisfying/being mother versus father (1 Father/2 Uncertain or both/3 Mother); 32 = liking infant care (1 Definite dislike / … / 5 Definite like); 33 = child age preference (1 Older than toddler/2 Toddlers/3 Infants); 34 = career versus childcare (1 No children, full-time career / … / 4 Children, no career); 35 = more satisfaction in life: women or men (1 Men/2 Uncertain/3 Women); 36 = start all over: Male or Female (1 Man/2 Uncertain/3 Woman). Cronbach's α=.67 for the DES control sample involving both men and women. The other summary scale was a female-only 12-item scale, AdultSum2, consisting of the 9-item scale above plus the female-only items 09 = physical activity compared to other women (1 Less active/2 Equal/3 More active); 10 = athletic ability compared to other women (1 Very good / … / 5 Very poor); 22/27 = wanting to be or already being married (1 No/2 Uncertain/3 Yes). Cronbach's α=.47 for the DES control females, lower than for AdultSum1 because of the reduced variability within a females-only control sample. Sex differences are reported in the Results section below.

Masculine Gender Identity Scale in Women (MGI-F)

Part B (9 items) of the MGI-F (Blanchard & Freund, 1983) described earlier focuses on cross-dressing and erotic preferences, with one item for childhood, one for adolescence, and 7 for adulthood. (Example: Question 22. Since the age of 17, did you put on men's underwear or clothing: 2 Once a month or more, for at least a year/1 (Less often, but) several times a year for at least 2 years/0 very seldom or never). Both parts were also combined to a total scale (A & B). The authors reported an internal consistency (Cronbach's α) of .92 for Part B, but no coefficient for A & B. Part B discriminated well (p≤.001) between homosexual and transsexual samples, and A & B even better.

Hobby Preferences Scale (HPS)

The HPS (Lippa, 1991, 1995) is a 60-item checklist of leisure time activities using a 5-point Likert response scale (1 Strongly dislike, 2 Slightly dislike, 3 Neutral, 4 Slightly like, 5 Strongly like). However, instead of Lippa's “gender diagnosticity” scaling, we used a simple quantification derived from the classification of the items as male-typical or female-typical based on Lippa's sample, thus obtaining a bipolar preference scale (Hobby Sum) with moderate internal consistency (Cronbach's α=.76).

Career Questionnaire (CareerQ)

The CareerQ (Berenbaum, 1999) is a 70-item written self-report questionnaire, basically a preference checklist of sex-typed occupations derived from census data regarding the proportion of women and men in each occupation (Example: archeologist: yes/no). It yields two summary scores (Male-typical careers and Female-typical careers), and within our samples moderate internal consistencies (Cronbach's α=.73 and .72, respectively).

Sex-Role Behavior Scale-Revised (SRBS-R)

The SRBS-R (Orlofsky, Ramsden, & Cohen, 1982) is a written self-report questionnaire, which, in its construction and its separate assessment of masculine and feminine qualities, is somewhat comparable to personality trait sex-role questionnaires, but more behaviorally oriented in item formulation. This 240-item self-report inventory yields three overall scales, Male-valued (more typical of men but desirable for both sexes), Female-valued (more typical of women but desirable for both sexes), and Sex-specific (more typical of one sex and desirable only for that sex), as well as 12 analogous area subscales, three for each of four interest/behavior areas: recreational and leisure activities, vocational preferences, social/dating interaction, and marital behaviors. (Example for a Female-valued marital behavior item: “Being concerned about maintaining one's sexual attractiveness: 1 Much more characteristic of my spouse or partner, 2 Slightly more characteristic of my spouse or partner, 3 Equally characteristic of my spouse or partner and me, or not characteristic of either of us, 4 Slightly more characteristic of me, 5 Much more characteristic of me.” If the person has not encountered the situation, s/he is to rate “how likely the behavior would be for [her/him].”) For all overall scales and area subscales, the authors reported internal consistencies (Cronbach's α) of ≥.70. All scales except Male-Valued Vocational Interests showed highly significant (p≤.001) sex differences, with Cohen's d for the three overall scales of 1.26 (Male-valued), 1.81 (Female-valued), and 4.36 (Sex-specific [MF]).

Data analysis

The study groups were compared on all gender-related variables, first by overall standard parametric procedures (ANOVA), and then, for exploratory purposes, by pairwise independent-samples t-tests. If, for a given comparison, any of the demographic variables (ethnicity, mean parental education [as an index of socioeconomic status], and age) differed significantly between the study groups and was correlated with the gender-related variable in the control group, the potential confounding influence of the demographic variable was controlled for by including it in a regression analysis. If study groups differed significantly in the variability of the gender-related variable on Levene's test for the equality of variances, weighted least squares (WLS) regressions were performed for analyses requiring demographic controls; for analyses without demographic controls, t-tests with equal variances not assumed were used. This was necessary in only a small number of comparisons. All statistical analyses were conducted using SPSS for Windows Release 13.0.1 (December 12, 2004). In view of the modest sample sizes of women with the rare syndrome of CAH, both conventionally (p < .05) and marginally (p < .10) significant results were listed. Effect sizes (Cohen's d) were calculated using pooled variances for differences between control women and men from the preceding DES project, but using the variance of the sibling controls for differences between CAH groups and controls, because of the frequently increased gender-scale variance of CAH samples.
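
The two effect-size conventions described above differ only in the standardizer. The following is a minimal sketch of both, working from summary statistics; the function names are illustrative, the pooled example uses hypothetical values, and the second example applies the control-SD formula to the MGI-F Part A means and control SD printed in Table 3 purely as an illustration, not as a value reported by the authors in this section.

```python
from math import sqrt


def cohens_d_pooled(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d standardized by the pooled standard deviation of both groups."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / sqrt(pooled_var)


def cohens_d_control_sd(m_group, m_control, sd_control):
    """Cohen's d standardized by the control group's standard deviation only."""
    return (m_group - m_control) / sd_control


# Hypothetical summary statistics for 67 control women and 60 control men.
print(cohens_d_pooled(10.0, 2.0, 67, 8.0, 2.0, 60))

# MGI-F Part A, SW vs. COS, using the means and the COS SD from Table 3.
print(round(cohens_d_control_sd(13.14, 4.35, 3.91), 2))
```

Standardizing by the control SD avoids shrinking d when the CAH group's variance is inflated, which is the rationale the authors give for the second convention.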

Results

Sample demographics

Table 2 shows the demographic data for the four study groups (see footnote 1). The SW women were significantly younger (by 5–6 years) than the other groups of women, the parents of the SV women were lower in educational level than the parents of NC and COS women, and the educational level of the SW women was lower than that of the NC and COS women. These differences in demographics were taken into account in the data analysis described above.
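
Table 2. Demographics

The last seven columns give p values from the statistical tests.

| Variable | SW n (%) | SV n (%) | NC n (%) | COS n (%) | 4 GPS | SW vs SV | SW vs NC | SW vs COS | SV vs NC | SV vs COS | NC vs COS |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Race/ethnicity | 40 | 21 | 82 | 24 | ns | ns | .066 | ns | ns | ns | ns |
| White | 32 (80) | 19 (90) | 76 (93) | 22 (92) | | | | | | | |
| Hispanic | 5 (12) | 1 (5) | 6 (7) | 2 (8) | | | | | | | |
| African-American | 3 (8) | 1 (5) | 0 (0) | 0 (0) | | | | | | | |
| Age (in years) M | 28.78 | 33.88 | 33.53 | 34.73 | .017 | .026 | .005 | .010 | ns | ns | ns |
| SD | 8.21 | 8.50 | 8.84 | 9.35 | | | | | | | |
| Range | 18–51 | 18–51 | 18–61 | 19–51 | | | | | | | |
| Hollingshead education | | | | | | | | | | | |
| Father's M | 5.22 | 4.50 | 5.64 | 5.71 | .048 | ns | ns | ns | .007 | .038 | ns |
| SD | 1.71 | 1.63 | 1.48 | 1.81 | | | | | | | |
| Mother's M | 5.22 | 4.44 | 5.59 | 5.22 | .050 | .090 | ns | ns | .004 | ns | ns |
| SD | 1.57 | 1.36 | 1.41 | 1.88 | | | | | | | |
| Subject's M | 5.16 | 5.43 | 5.86 | 6.04 | .030 | ns | .005 | .004 | ns | ns | ns |
| SD | 1.20 | 1.72 | 1.29 | 1.04 | | | | | | | |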

Note. Statistical tests for ethnicity are chi-square (White vs. Other), for the other variables ANOVAs (4-group comparisons [4 GPS]) and t-tests.

Gender-related variables recalled from childhood

Table 3 shows the retrospective childhood data (see footnote 1). Of the 21 variables listed, 17 had highly significant (p≤.009) 4-group ANOVAs, and only 4 were not significant. The patterns of results were as expected, a few inconsistencies notwithstanding. On average, the SW women were the most masculine/least feminine, with SV and NC women intermediate, and the control women at the other end. This pattern was also reflected in the number of significant results of paired comparisons. There were (conventionally) significant differences between SW and SV women on 13 variables, between SW and NC on 20, and between SW and COS on 20 (plus one marginally significant difference), between SV and NC on 4 (plus two marginal ones), between SV and COS on 9 (plus 5 marginal ones), and between NC and COS on 2 (plus 2 marginal ones).
Table 3

Group comparisons on gender-related variables recalled from childhood

Group columns give M (SD); the remaining columns give p values of the statistical tests.

| Variable | High scores | SW | SV | NC | COS | 4 GPS | SW vs SV | SW vs NC | SW vs COS | SV vs NC | SV vs COS | NC vs COS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CPAQ subscales (5–8 yrs) | | | | | | | | | | | | |
| Sports | Male | 1.68 (1.05) | 0.90 (1.04) | 0.60 (0.83) | 0.42 (0.50) | .000 | .022 | .000 | .000 | ns | .061 | ns |
| Nonathletic play | Male | 4.98 (1.70) | 4.81 (1.60) | 4.27 (1.88) | 4.17 (1.49) | ns | ns | .048 | .059 | ns | ns | ns |
| Feminine play | Female | 3.95 (2.94) | 6.14 (2.26) | 6.96 (1.98) | 7.29 (1.88) | .000 | .004 | .000 | .000 | ns | .070 | ns |
| Activities | Female | 4.38 (1.60) | 4.71 (1.38) | 5.05 (1.17) | 5.25 (0.90) | ns | ns | .021 | .007 | ns | ns | ns |
| CPAQ subscales (9–13 yrs) | | | | | | | | | | | | |
| Sports | Male | 3.56 (1.60) | 1.65 (1.90) | 1.30 (1.54) | 0.96 (1.20) | .000 | .000 | .000 | .000 | ns | ns | ns |
| Nonathletic play | Male | 3.41 (2.24) | 2.85 (1.23) | 3.33 (1.89) | 2.29 (1.46) | ns | ns | ns | .020 | ns | ns | .015 |
| Feminine play | Female | 1.95 (1.90) | 3.60 (2.30) | 4.65 (2.13) | 4.92 (1.95) | .000 | .005 | .000 | .000 | .054 | .046 | ns |
| Social | Female | 4.23 (1.58) | 5.40 (2.09) | 5.69 (1.64) | 5.75 (1.98) | .000 | .019 | .000 | .000 | ns | ns | ns |
| CPAQ summary scales | | | | | | | | | | | | |
| M SUM | Male | 13.62 (4.65) | 10.20 (4.79) | 9.52 (4.84) | 7.83 (3.41) | .000 | .011 | .000 | .000 | ns | .063 | ns |
| F SUM | Female | 14.44 (6.26) | 19.75 (5.95) | 22.43 (4.86) | 23.21 (4.14) | .000 | .003 | .000 | .000 | .037 | .035 | ns |
| CPAQ play-related items | | | | | | | | | | | | |
| 2. Afraid of physical injury (a) | Female | 1.38 (0.54) | 1.55 (0.69) | 1.67 (0.80) | 1.83 (0.70) | .081 | ns | .023 | .006 | ns | ns | ns |
| 4. Played with girls (<12 yrs) (b) | Female | 2.59 (0.82) | 3.25 (0.91) | 3.41 (0.65) | 3.71 (0.46) | .000 | .007 | .000 | .000 | ns | .051 | .017 |
| 5. Played with boys (<12 yrs) (b) | Male | 2.95 (0.65) | 3.00 (0.86) | 2.43 (0.77) | 2.38 (0.77) | .000 | ns | .000 | .002 | .004 | .015 | ns |
| 7. Played competitive group games (c) | Male | 2.90 (0.94) | 2.85 (0.88) | 2.33 (0.72) | 2.21 (0.59) | .000 | ns | .001 | .001 | .007 | .009 | ns |
| 8. Played baseball (c) | Male | 2.26 (0.78) | 2.05 (1.05) | 1.70 (0.81) | 1.50 (0.51) | .000 | ns | .000 | .000 | ns | .042 | ns |
| RCGQ-R | | | | | | | | | | | | |
| Gender role | Female | 2.72 (0.89) | 3.79 (0.78) | 4.02 (0.61) | 4.26 (0.65) | .000 | .000 | .000 | .000 | ns | .033 | .088 |
| Physical activity | Female | 2.52 (0.71) | 2.78 (0.87) | 3.10 (0.81) | 3.21 (0.82) | .009 | ns | .001 | .002 | ns | .032 | ns |
| Cross-gender desire | Female | 4.37 (0.61) | 4.82 (0.34) | 4.77 (0.42) | 4.72 (0.56) | .002 | .003 | .004 | .044 | ns | ns | ns |
| MGI-F | | | | | | | | | | | | |
| Part A | Male | 13.14 (6.38) | 7.25 (4.83) | 6.15 (4.42) | 4.35 (3.91) | .000 | .000 | .000 | .000 | ns | .038 | .087 |
| GRAS-A | | | | | | | | | | | | |
| ChildSum1 | Female | 2.41 (0.65) | 3.24 (0.64) | 3.51 (0.63) | 3.59 (0.67) | .000 | .000 | .000 | .000 | .081 | .082 | ns |
| ChildSum2 | Female | 2.51 (0.82) | 3.27 (0.71) | 3.64 (0.72) | 3.77 (0.76) | .000 | .001 | .000 | .000 | .040 | .030 | ns |

Note. Statistical tests are ANOVAs (4-group comparisons [4 GPS]) and t-tests or WLS regressions.

(a) Response scale: 1 Seldom /…/ 4 Always.

(b) Response scale: 1 Never /…/ 4 Always.

(c) Response scale: 1 Never /…/ 4 Very Often.

Table 4 shows the respective effect sizes for the differences between CAH and control women along with the effect size for sex differences from the earlier DES study, where applicable, and with the internal consistency coefficients (Cronbach's α) of the scales in the current total sample. The effect size pattern mirrored the earlier findings. The (absolute) effect sizes ranged from 0.54 to 2.52 (median, 1.73) for the SW women, from 0.18 to 0.96 (median, 0.60) for the SV women, and from 0.07 to 0.71 (median, 0.20) for the NC women.
Table 4

Effect sizes for group comparisons on gender-related variables recalled from childhood

| Variable | High scores | α (a) | M vs. F (b) | SW vs. COS | SV vs. COS | NC vs. COS |
|---|---|---|---|---|---|---|
| CPAQ subscales (5–8 years) | | | | | | |
| Sports | Male | .59 | −1.55 | −2.52 | −0.96 | −0.36 |
| Nonathletic play | Male | .58 | −0.90 | −0.54 | −0.43 | −0.07 |
| Feminine play | Female | .79 | 1.92 | 1.78 | 0.61 | 0.18 |
| Activities | Female | .60 | 0.46 | 0.97 | 0.60 | 0.22 |
| CPAQ subscales (9–13 years) | | | | | | |
| Sports | Male | .72 | −1.94 | −2.17 | −0.58 | −0.28 |
| Nonathletic play | Male | .64 | −1.35 | −0.77 | −0.38 | −0.71 |
| Feminine play | Female | .70 | 2.35 | 1.52 | 0.68 | 0.14 |
| Social | Female | .67 | 0.91 | 0.77 | 0.18 | 0.30 |
| CPAQ summary scales | | | | | | |
| M SUM | Male | .83 | −2.00 | −1.70 | −0.70 | −0.50 |
| F SUM | Female | .87 | 2.03 | 2.12 | 0.84 | 0.19 |
| RCGQ-R | | | | | | |
| Gender role | Female | .91 | | 2.37 | 0.72 | 0.37 |
| Physical activity | Female | .64 | | 0.84 | 0.52 | 0.13 |
| Cross-gender desire | Female | .47 | | 0.62 | −0.18 | −0.09 |
| MGI-F | | | | | | |
| Part A | Male | .86 | | −2.25 | −0.74 | −0.46 |
| GRAS-A | | | | | | |
| ChildSum1 | Female | .77 | 2.95 | 1.76 | 0.52 | 0.12 |
| ChildSum2 | Female | .83 | NA | 1.66 | 0.66 | 0.17 |

Note. NA: Not applicable.

(a) Cronbach's α based on the total sample of CAH and control women from the current study.

(b) Based on 67 female and 60 male non-DES controls from the preceding study on the effects of prenatal DES exposure on adult behavior.
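This section does not restate how the effect sizes were computed. As a reading aid, the sketch below checks one candidate definition against the tabulated values: the control-group mean minus the CAH-group mean, divided by the control-group SD. That formula is an assumption inferred from the numbers, not a definition taken from the text, and the function and variable names are illustrative. The means and SDs are copied from Table 3.

```python
def control_scaled_effect_size(mean_control, mean_group, sd_control):
    """Mean difference (control minus group) in units of the control SD.

    Assumed definition: it reproduces the tabulated values for the rows
    checked here, but the paper's own formula is not given in this section.
    """
    return (mean_control - mean_group) / sd_control


# (mean, SD) per group, copied from Table 3
cpaq_sports_5_8 = {"SW": (1.68, 1.05), "SV": (0.90, 1.04),
                   "NC": (0.60, 0.83), "COS": (0.42, 0.50)}
cpaq_feminine_play_5_8 = {"SW": (3.95, 2.94), "SV": (6.14, 2.26),
                          "NC": (6.96, 1.98), "COS": (7.29, 1.88)}


def effect_sizes(row):
    """Return the three CAH-vs-control effect sizes for one Table 3 row."""
    cos_mean, cos_sd = row["COS"]
    return {
        group: round(control_scaled_effect_size(cos_mean, row[group][0], cos_sd), 2)
        for group in ("SW", "SV", "NC")
    }


print(effect_sizes(cpaq_sports_5_8))
print(effect_sizes(cpaq_feminine_play_5_8))
```

Under this assumed definition, both rows match the corresponding Table 4 entries, including the sign convention: negative values arise where the CAH group scores higher than controls on a male-typical scale.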

Gender-related variables from adulthood

Table 5 shows the adulthood data (see footnote 1). Of the 22 variables listed, 18 had (conventionally) significant 4-group ANOVAs, and only 4 were not significant. Again, the patterns of results were as expected, a few inconsistencies notwithstanding, with the SW women as a group at the high-masculine/low-feminine pole and the control women at the other pole. Again, this pattern was reflected in the number of significant results of paired comparisons. There were (conventionally) significant differences between SW and SV women on 14 variables (plus 2 marginal ones), between SW and NC on 19, and between SW and COS on 17 (plus two marginal ones), between SV and NC on 2 (plus four marginal ones), between SV and COS on 3 (plus 2 marginal ones), and between NC and COS on 1 (plus 2 marginal ones).
Table 5

Group comparisons on gender-related variables from adulthood

Group columns give M (SD); the remaining columns give p values of the statistical tests.

| Variable | High scores | SW | SV | NC | COS | 4 GPS | SW vs SV | SW vs NC | SW vs COS | SV vs NC | SV vs COS | NC vs COS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GRAS-A | | | | | | | | | | | | |
| AdultSum1 | Female | 3.45 (0.68) | 3.98 (0.41) | 3.94 (0.40) | 4.08 (0.58) | .000 | .000 | .000 | .000 | ns | ns | ns |
| AdultSum2 | Female | 3.35 (0.68) | 3.84 (0.33) | 3.82 (0.39) | 3.94 (0.47) | .000 | .000 | .000 | .000 | ns | ns | ns |
| MGI-F | | | | | | | | | | | | |
| Part B | Male | 2.62 (1.99) | 2.00 (1.87) | 1.09 (1.43) | 1.41 (1.59) | .000 | ns | .000 | .036 | .048 | ns | ns |
| A & B | Male | 15.59 (7.92) | 9.25 (5.05) | 7.24 (4.80) | 5.80 (5.32) | .000 | .000 | .000 | .000 | .095 | .037 | ns |
| HPS | | | | | | | | | | | | |
| Hobby sum | Female | 3.06 (0.36) | 3.54 (0.37) | 3.68 (0.33) | 3.85 (0.38) | .000 | .000 | .000 | .000 | ns | .009 | .035 |
| CareerQ | | | | | | | | | | | | |
| Male-typical | Male | 7.07 (3.86) | 6.42 (3.73) | 5.03 (2.99) | 4.54 (2.19) | .008 | ns | .006 | .006 | .087 | .062 | ns |
| Female-typical | Female | 4.11 (2.98) | 6.68 (2.91) | 5.47 (2.84) | 6.63 (3.03) | .006 | .006 | .037 | .004 | .093 | ns | .092 |
| SRBS-2 | | | | | | | | | | | | |
| Male-valued | Male | 2.73 (0.34) | 2.82 (0.29) | 2.70 (0.33) | 2.72 (0.26) | ns | ns | ns | ns | ns | ns | ns |
| Female-valued | Female | 2.97 (0.33) | 3.45 (0.31) | 3.39 (0.30) | 3.45 (0.35) | .000 | .000 | .000 | .000 | ns | ns | ns |
| Sex-specific (MF) | Male | 2.77 (0.41) | 2.37 (0.26) | 2.36 (0.21) | 2.27 (0.31) | .000 | .000 | .000 | .000 | ns | ns | ns |
| M-Recreational | Male | 2.67 (0.63) | 2.56 (0.63) | 2.50 (0.65) | 2.42 (0.58) | ns | ns | ns | ns | ns | ns | ns |
| M-Vocational | Male | 1.93 (0.63) | 2.24 (0.49) | 2.20 (0.69) | 2.23 (0.56) | ns | .053 | .039 | .058 | ns | ns | ns |
| M-Social | Male | 3.09 (0.56) | 3.35 (0.58) | 3.21 (0.66) | 3.34 (0.73) | ns | .096 | ns | ns | ns | ns | ns |
| M-Marital | Male | 2.98 (0.36) | 2.98 (0.40) | 2.82 (0.30) | 2.81 (0.37) | .034 | ns | .012 | .085 | .040 | ns | ns |
| F-Recreational | Female | 2.61 (0.54) | 3.07 (0.44) | 3.16 (0.52) | 3.07 (0.54) | .000 | .001 | .000 | .002 | ns | ns | ns |
| F-Vocational | Female | 1.95 (0.55) | 2.64 (0.66) | 2.64 (0.56) | 2.78 (0.59) | .000 | .000 | .000 | .000 | ns | .072 | ns |
| F-Social | Female | 3.21 (0.39) | 3.75 (0.44) | 3.69 (0.57) | 3.76 (0.45) | .000 | .000 | .000 | .000 | ns | ns | ns |
| F-Marital | Female | 3.27 (0.41) | 3.66 (0.33) | 3.54 (0.32) | 3.58 (0.37) | .000 | .000 | .000 | .005 | ns | ns | ns |
| MF-Recreational | Male | 3.04 (0.38) | 2.50 (0.47) | 2.47 (0.36) | 2.36 (0.44) | .000 | .000 | .000 | .000 | ns | ns | ns |
| MF-Vocational | Male | 2.94 (0.55) | 2.56 (0.32) | 2.49 (0.28) | 2.30 (0.43) | .000 | .005 | .000 | .000 | .085 | .000 | .052 |
| MF-Social | Male | 2.60 (0.36) | 2.52 (0.62) | 2.33 (0.43) | 2.30 (0.49) | .012 | ns | .001 | .009 | ns | ns | ns |
| MF-Marital | Male | 2.59 (0.55) | 2.07 (0.37) | 2.21 (0.35) | 2.19 (0.34) | .000 | .000 | .000 | .001 | ns | ns | ns |

Note. Statistical tests are ANOVAs (4-group comparisons [4 GPS]) and t-tests or WLS regressions.

Table 6 shows the respective effect sizes for syndrome-control group differences, again along with the effect sizes for sex differences from our earlier DES study and the scales' Cronbach's α coefficients in the current total sample. Also here, the effect size pattern mirrored the earlier findings. The (absolute) effect sizes ranged from 0.04 to 2.08 (median, 1.12) for the SW women, from 0.00 to 0.86 (median, 0.24) for the SV women, and from 0.03 to 0.45 (median, 0.16) for the NC women. (In this calculation, the effect sizes of the few differences against expectations were added to the low end.)

Scale intercorrelations

As we selected instruments with relatively global summary scales, the question arises how closely such measures are related. Table 7 presents the intercorrelations among the (retrospective) childhood measures. The 28 coefficients ranged (in absolute size) from .06 to .94, with a median of .51. (The direction of the intercorrelations depended on the definitions of the high scores of individual scales as male or female.) The upper limit was provided by the correlation between ChildSum1 and ChildSum2 of the GRAS, which is not surprising given the overlap in items on the two scales. The lowest value indicated the independent variation of (recalled) engagements in male-typical and female-typical childhood play and sports activities. The most comprehensive scales in terms of domain coverage (Gender Role of the RCGQ-R, Part A of the MGI-F, and the ChildSum scales of the GRAS-A) showed (absolute) intercorrelations between .75 and .85, i.e., high enough to warrant the selection of only one of them for a given study. The (absolute) intercorrelations of the two Sum scales of the CPAQ with the three global gender scales ranged from .45 to .62; i.e., the CPAQ Sum scales were less well suited as global measures and quite comparable in that regard to the remaining scales.

Table 8 shows the 45 intercorrelations among the summary scales for adulthood. They ranged (in absolute size) from .02 to .94, with a median of .35, i.e., clearly lower than for the child scales (p≤.001, Fisher's Exact Test applied to Median test). The upper limit was again provided by two scales from the GRAS-A with considerable overlap in items. The next highest intercorrelation, .69, was between the A & B scale of the MGI-F and the Sex-Specific (MF) scale of the SRBS-2, but note that the A & B scale shared the majority of items with the Part A scale, which was primarily focused on recalled childhood behavior. Thus, the adulthood scales do not include a cluster of highly intercorrelated global gender scales; each of the summary scales appears to have considerable unshared variance.
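The comparison of the childhood and adulthood coefficient distributions rests on a median test evaluated with Fisher's exact test. The sketch below illustrates that procedure with the absolute coefficients transcribed from Tables 7 and 8. It is a minimal illustration, not a reconstruction of the authors' analysis: the handling of ties at the pooled median (strictly above versus at or below) and the two-sided p-value rule are assumptions, so the result need not equal the reported value exactly.

```python
from math import comb
from statistics import median

# Absolute intercorrelations, transcribed row by row from Tables 7 and 8
child = [.06, .54, .57, .35, .54, .61, .62, .60, .18, .31, .52, .46, .45,
         .48, .58, .85, .79, .84, .30, .48, .53, .57, .59, .48, .51,
         .75, .76, .94]
adult = [.94, .37, .56, .36, .11, .36, .03, .48, .54,
         .36, .55, .32, .13, .29, .07, .43, .55,
         .62, .23, .16, .06, .20, .26, .42,
         .59, .48, .25, .28, .38, .69,
         .32, .34, .25, .48, .61,
         .25, .47, .02, .36,
         .18, .48, .46,
         .32, .28,
         .58]


def median_split(sample_a, sample_b):
    """2x2 counts (above, not above) relative to the pooled median."""
    pooled_median = median(sample_a + sample_b)
    a = sum(x > pooled_median for x in sample_a)
    c = sum(x > pooled_median for x in sample_b)
    return a, len(sample_a) - a, c, len(sample_b) - c


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test: sum of all table probabilities
    (fixed margins) that are no larger than the observed one."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


table = median_split(child, adult)
p_value = fisher_exact_two_sided(*table)
print(table, p_value)
```

With this tie rule, most childhood coefficients fall above the pooled median and most adulthood coefficients fall at or below it, which is the pattern the text describes.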
Table 6

Effect sizes for group comparisons on gender-related variables from adulthood

| Variable | High scores | α (a) | M vs. F (b) | SW vs. COS | SV vs. COS | NC vs. COS |
|---|---|---|---|---|---|---|
| GRAS-A | | | | | | |
| AdultSum1 | Female | .47 | 2.42 | 1.09 | 0.17 | 0.24 |
| AdultSum2 | Female | .61 | NA | 1.26 | 0.21 | 0.26 |
| MGI-F | | | | | | |
| Part B | Male | .51 | | −0.76 | −0.37 | 0.20 |
| A & B | Male | .86 | | −1.84 | −0.65 | −0.27 |
| HPS | | | | | | |
| Hobby Sum | Female | .76 | | 2.08 | 0.82 | 0.45 |
| CareerQ | | | | | | |
| Male-typical | Male | .73 | | −1.16 | −0.86 | −0.22 |
| Female-typical | Female | .72 | | 0.83 | −0.02 | 0.38 |
| SRBS-2 | | | | | | |
| Male-valued | Male | .85 | −1.70 | −0.04 | −0.38 | 0.08 |
| Female-valued | Female | .89 | 2.03 | 1.37 | 0.00 | 0.17 |
| Sex-specific (MF) | Male | .88 | −4.41 | −1.61 | −0.32 | −0.29 |
| M-Recreational activities | Male | .77 | −1.01 | −0.43 | −0.24 | −0.14 |
| M-Vocational interests | Male | .79 | −0.20 | 0.54 | −0.02 | 0.05 |
| M-Social/dating behavior | Male | .74 | −0.70 | 0.34 | −0.01 | 0.18 |
| M-Marital behavior | Male | .79 | −2.11 | −0.46 | −0.46 | −0.03 |
| F-Recreational activities | Female | .55 | 0.84 | 0.85 | 0.00 | −0.17 |
| F-Vocational interests | Female | .75 | 0.80 | 1.41 | 0.24 | 0.24 |
| F-Social/dating behavior | Female | .77 | 1.24 | 1.22 | 0.02 | 0.16 |
| F-Marital behavior | Female | .85 | 1.64 | 0.84 | −0.22 | 0.11 |
| MF-Recreational activities | Male | .68 | −3.03 | −1.55 | −0.32 | −0.25 |
| MF-Vocational interests | Male | .75 | −1.81 | −1.49 | −0.60 | −0.44 |
| MF-Social/dating behavior | Male | .61 | −2.11 | −0.61 | −0.45 | −0.06 |
| MF-Marital behavior | Male | .82 | −4.56 | −1.18 | 0.35 | −0.06 |

Note. NA: Not applicable.

(a) Cronbach's α based on the total sample of CAH and control women from the current study.

(b) Based on 67 female and 60 male non-DES controls from the preceding study on the effects of prenatal DES exposure on adult behavior.

Table 7

Intercorrelations (r) of major gender scales: Childhood

| Scale | CPAQ F SUM | RCGQ-R Gender role | RCGQ-R Physical activity | RCGQ-R Cross-gender desire | MGI-F Part A | GRAS-C ChildSum1 | GRAS-C ChildSum2 |
|---|---|---|---|---|---|---|---|
| CPAQ M SUM | −.06 | −.54*** | −.57*** | −.35*** | .54*** | −.61*** | −.62*** |
| CPAQ F SUM | | .60*** | .18* | .31*** | −.52*** | .46*** | .45*** |
| RCGQ-R Gender role | | | .48*** | .58*** | −.85*** | .79*** | .84*** |
| RCGQ-R Physical activity | | | | .30*** | −.48*** | .53*** | .57*** |
| RCGQ-R Cross-gender desire | | | | | −.59*** | .48*** | .51*** |
| MGI-F Part A | | | | | | −.75*** | −.76*** |
| GRAS-C ChildSum1 | | | | | | | .94*** |

*p≤.05; **p≤.01; ***p≤.001.

Table 8

Intercorrelations (r) of major gender scales: Adulthood

| Scale | GRAS-A AdultSum2 | MGI-F Part B | MGI-F A & B | HPS Hobby Sum | CareerQ Male-typical | CareerQ Female-typical | SRBS-2 Male-valued | SRBS-2 Female-valued | SRBS-2 Sex-specific (MF) |
|---|---|---|---|---|---|---|---|---|---|
| GRAS-A AdultSum1 | .94*** | −.37*** | −.56*** | .36*** | −.11 | .36*** | .03 | .48*** | −.54*** |
| GRAS-A AdultSum2 | | −.36*** | −.55*** | .32*** | −.13 | .29*** | −.07 | .43*** | −.55*** |
| MGI-F Part B | | | .62*** | −.23*** | .16 | −.06 | .20* | −.26*** | .42*** |
| MGI-F A & B | | | | −.59*** | .48*** | −.25** | .28*** | −.38*** | .69*** |
| HPS Hobby Sum | | | | | −.32*** | .34*** | −.25** | .48*** | −.61*** |
| CareerQ Male-typical | | | | | | .25** | .47*** | −.02 | .36*** |
| CareerQ Female-typical | | | | | | | .18* | .48*** | −.46*** |
| SRBS-2 Male-valued | | | | | | | | .32*** | .28*** |
| SRBS-2 Female-valued | | | | | | | | | −.58*** |

*p≤.05; **p≤.01; ***p≤.001.

Table 9 presents the correlations between childhood and adulthood scales. The 147 coefficients ranged (in absolute value) from .00 to .97, with a median of .39. The highest coefficient comes from the scales A & B and Part A of the MGI-F, which have the majority of items in common. Similarly, the relatively high correlations of A & B with the two other most comprehensive childhood gender scales reflect their correlations with Part A. The majority of the other correlations were also significant, indicating continuity to varying degrees from (recalled) childhood to adulthood gender-related behavior. Adult variables tended to correlate highest with the three most comprehensive childhood scales. Among the adulthood scales, the CareerQ scales, especially Female-typical, and the first two summary scales of the SRBS-2 appear to show relatively lower correlations with childhood scales.
Table 9

Correlations (r) of major gender scales: Adulthood versus childhood

Rows are adulthood scales; columns are childhood scales.

| Adulthood scale | CPAQ M SUM | CPAQ F SUM | RCGQ-R Gender role | RCGQ-R Physical activity | RCGQ-R Cross-gender desire | MGI-F Part A | GRAS-A ChildSum1 | GRAS-A ChildSum2 |
|---|---|---|---|---|---|---|---|---|
| GRAS-A AdultSum1 | −.18* | .36*** | .49*** | .32*** | .39*** | −.55*** | .41*** | .43*** |
| GRAS-A AdultSum2 | −.28*** | .31*** | .46*** | .42*** | .34*** | −.53*** | .44*** | .46*** |
| MGI-F Part B | .08 | −.24** | −.32*** | −.16 | −.23** | .42*** | −.29*** | −.30*** |
| MGI-F A & B | .48*** | −.50*** | −.84*** | −.45*** | −.58*** | .97*** | −.72*** | −.73*** |
| HPS Hobby Sum | −.49*** | .53*** | .70*** | .38*** | .37*** | −.62*** | .62*** | .62*** |
| CareerQ Male-typical | .48*** | −.15 | −.39*** | −.45*** | −.20* | .47*** | −.53*** | −.54*** |
| CareerQ Female-typical | .07 | .48*** | .36*** | .00 | .16 | −.29*** | .17* | .18* |
| SRBS-2 Male-valued | .46*** | .05 | −.28*** | −.40*** | −.18* | .27*** | −.26*** | −.29*** |
| SRBS-2 Female-valued | −.11 | .56*** | .40*** | .13 | .17* | −.38*** | .36*** | .32*** |
| SRBS-2 Sex-specific (MF) | .26*** | −.52*** | −.56*** | −.32*** | −.26** | .68*** | −.55*** | −.52*** |

*p≤.05; **p≤.01; ***p≤.001.

Gender identity problems

Three SW patients of the study population reported clinically significant problems with gender identity. One of these women sought advice from us regarding a desired gender change to male but, after counseling, decided not to seek legal gender change and genital surgery; her data are included in the SW group. The other two patients were not included in the data analysis and results described above. One of them had gradually changed in adulthood to living as a male (without formal legalization), and some of the assessment methods were gender-incompatible. The other asked for help in changing to male, but with counseling decided to forego legal gender change and genital surgery at this time. She was not included in the data analysis because significant cognitive limitations interfered with the standard administration of the assessment battery. The first two of these three patients were included with detailed data in a prior report on gender change (Meyer-Bahlburg et al., 1996). None of the other groups included an individual with manifest gender dysphoria or gender change.

The original article of the MGI-F (Blanchard & Freund, 1983) provided cut-off scores for the differentiation of transsexual from non-transsexual individuals. When applied to our analysis sample, 5 women fell into the transsexual range on Part A, none on Part B, and 2 on A & B combined. The one woman mentioned earlier who had asked for advice regarding gender change fell into the transsexual range on both Part A and A & B, but not Part B. (The other two individuals with clinically manifest gender problems were excluded from group analysis, as noted above.)

By recollection, gender identity problems seem to have been more frequent in childhood. On the Cross-Gender Desire scale of the RCGQ-R, the SW women on average were significantly lower, i.e., more frequently cross-gendered, than the other three groups (Table 3). Item analysis showed that this was mainly due to item 17 (“As a child, I had the desire to be a boy”: 1 Almost always / … / 5 Never). “Frequently” (“2”) and “Sometimes” (“3”) were checked by 3 and 6, respectively, of 27 SW, by 0 and 2, respectively, of 19 SV, by 1 and 3, respectively, of 76 NC, and by 0 and 3, respectively, of 23 control women. (The RCGQ-R was included in the protocol after the study was started; therefore, the n was reduced.)
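To make the item 17 counts easier to compare across groups of unequal size, the sketch below converts them to the percentage of each group who checked "Frequently" or "Sometimes". It is purely descriptive arithmetic on the counts reported above and assumes that the second group listed (n = 19) is the SV group; the variable and function names are illustrative.

```python
# (checked "Frequently", checked "Sometimes", group n) for RCGQ-R item 17
item17_counts = {
    "SW": (3, 6, 27),
    "SV": (0, 2, 19),   # assumed to be the SV group (n = 19)
    "NC": (1, 3, 76),
    "COS": (0, 3, 23),
}


def percent_frequently_or_sometimes(group):
    """Percentage of a group who checked response 2 or 3 on item 17."""
    frequently, sometimes, n = item17_counts[group]
    return round(100 * (frequently + sometimes) / n, 1)


for group in item17_counts:
    print(group, percent_frequently_or_sometimes(group))
```

On these counts, about a third of the SW women recalled at least occasional desire to be a boy, compared with roughly 5% to 13% in the other three groups.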

Discussion

This study is the first to evaluate gender outcome in a sizeable sample of NC women in comparison to other variants of CAH and normal controls. In terms of clinical somatic symptoms, NC constitutes the mildest form of CAH, after SW and SV. As NC women do not show genital ambiguity at birth, they are typically excluded from research studies on intersexuality. However, their genotype is present from conception, and it is likely that mild abnormalities of enzyme and adrenal hormone production are also present from the late first trimester of pregnancy on.

What are the implications for behavior? If there are no gender-behavioral effects, NC women should not differ from demographically comparable non-CAH control women, and the effect size for gender-related differences should be near zero. However, where the NC-COS differences in our study were significant, they showed behavioral masculinization/defeminization. Also, the effect sizes centered not around zero, but towards some degree of masculinization/defeminization in NC women. Three explanations should be considered. First, our control group was small and not perfectly comparable in demographics, so that modest shifts in difference patterns might just be due to sampling circumstances. However, the differences in demographics between our NC and COS samples were not only statistically non-significant, but also very small (Table 2). Second, the modest differences in gender behavior between NC and control women may not be just a sampling artefact, but valid and due to postnatal sex-hormone effects. In fact, a number of studies of adult (presumably non-CAH) women have shown correlations of androgen levels with occupational activity (Purifoy & Koopmans, 1979; Schindler, 1979), a history of criminal violence (Dabbs, Ruback, Frady, Hopper, & Sgoutas, 1988), or personality traits such as masculinity and femininity (Baucom, Besch, & Callahan, 1985). However, it is unclear whether elevated androgen levels in these adult studies are correlated with preceding prenatal androgen levels. Also, our NC participants were usually on androgen-reducing treatment, which should diminish postnatal androgen effects on gendered behavior, if they indeed exist. Third, the gender differences between NC and control women may, in fact, be due to prenatal hormone abnormalities: prenatal hormone effects that are too subtle to detect in genital tissues might nevertheless mildly affect the sexual differentiation of the brain and thereby later gender-related behavior. The current study can only raise the question, but not answer it.

The second goal of our study was to demonstrate that the findings of behavioral masculinization/defeminization in women with classical CAH hold up across diverse assessment instruments. In our study, gender-related behavior was rated on the basis of interviewer-masked detailed open-ended interviews as well as by the women themselves using written questionnaires with diverse item formats and response scales. The gender-related domains covered ranged from diverse recalled childhood behaviors to various adulthood behaviors and preferences; for one of the instruments, the SRBS-2, the behaviors were grouped according to their gender-specific desirability. Despite some minor inconsistencies, the expected direction of differences, especially of the most severely affected CAH women (SW) in comparison to the others, showed up in almost all variables. Only the results on the Male-valued overall scale of the SRBS-2 and some of its area subscales deviated from this pattern of findings, although we re-validated this scale's sex difference with the controls from our preceding DES study. (We do not have a plausible explanation for the lack of a CAH finding on the Male-valued scale at this time.)

Given the range of gender-related domains covered in our instruments, and given the fact that the expected direction of differences between and among CAH women showed up not only in global scales, but also in subscales and individual items, our study replicates previous findings of what we consider broad-band effects. Most domains of gender-related behavior appear to be affected by the atypical prenatal sex-hormonal milieu of 46,XX CAH individuals, albeit to varying degrees. Therefore, behavioral masculinization/defeminization constitutes a very robust outcome of the 46,XX CAH condition.

We also replicated previous findings that the SW variant, as the most severe and most strongly androgenized form of CAH, shows the most marked behavioral masculinization/defeminization, and that SV women were clearly less behaviorally affected. Many of the effect sizes for the SW-control group differences were very large when compared to effect sizes typical of clinical behavioral science research. On some measures, they even rivaled the size of the sex difference. In this context, however, we need to keep in mind that the published effect sizes for sex differences on our instruments as well as our previous DES project are based on convenience samples, that our CAH samples were not randomly drawn from strictly comparable populations, and that our sample sizes were small, so that we cannot be sure that this finding is fully valid.

Overall, the finding of a correlation between the degree of CAH severity (and, by implication, the degree of androgenization) and behavioral masculinization/defeminization suggests a crude dose-response relationship between prenatal androgens and later gender-differentiated behavior. Because the women with classical CAH in this study were also on glucocorticoid replacement therapy, which controls the excess adrenal androgens, a contribution of concurrent excess androgen levels to behavioral outcome in adulthood is not very likely.
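The "crude dose-response" reading can be made concrete with the median absolute effect sizes reported in the Results for each CAH variant versus controls. The sketch below only checks that these medians fall in the order of clinical severity (SW, then SV, then NC) in both age periods. It is an illustration of the argument, not an additional statistical test, and the names are illustrative.

```python
# Median absolute effect sizes (CAH variant vs. controls) as reported in the text
median_abs_effect = {
    "childhood": {"SW": 1.73, "SV": 0.60, "NC": 0.20},
    "adulthood": {"SW": 1.12, "SV": 0.24, "NC": 0.16},
}


def ordered_by_severity(medians):
    """True if the medians decrease from the most to the least severe variant."""
    return medians["SW"] > medians["SV"] > medians["NC"]


for period, medians in median_abs_effect.items():
    print(period, ordered_by_severity(medians))
```

The ordering holds in both periods, although the gap between SV and NC in adulthood is small, which fits the description of the relationship as crude.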

The assessment instruments in our study were deliberately selected for providing relatively global summary scales, which we prefer for clinical purposes. They are not well suited as measures of discrete behaviors or even traits that could be meaningfully compared to those behavioral units on which animal researchers focus when they investigate hormone-, dose-, and timing-specific hormone-behavior relationships. Thus, we will not attempt to characterize the scales used in this study as specific indicators of masculinization or defeminization.

Despite the marked behavioral masculinization/defeminization of women with classical CAH, all but three women were female-identified. Nevertheless, three individuals out of 42 women with SW-CAH, or out of 63 women with classical CAH, or even out of 145 females with any form of CAH, is a very high rate of gender change when compared to the highest reported community prevalence of female transsexuals, approximately one in 30,000 in the Netherlands (Bakker, van Kesteren, Gooren, & Bezemer, 1993). Thus, there is an increased, albeit modest, risk of gender dysphoria and gender change in women with classical CAH, as has also recently been demonstrated in a review of the world literature (Dessens, Slijper, & Drop, 2005). That DSD patients remember increased gender identity problems in earlier years has also been observed in other DSD samples (e.g., Meyer-Bahlburg et al., 2004; Wisniewski et al., 2004). This phenomenon should be more closely examined in longitudinal studies because of its theoretical and clinical implications for the development of gender dysphoria and gender change.
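The size of the contrast with the community figure can be worked out directly from the numbers given above. The sketch below expresses three cases per sample as a multiple of the cited community prevalence of one in 30,000, for each of the three denominators mentioned. It is plain arithmetic on the reported figures; it ignores differences in ascertainment between a clinic sample and a community estimate, and the names are illustrative.

```python
from fractions import Fraction

COMMUNITY_PREVALENCE = Fraction(1, 30000)  # highest reported figure cited in the text
CASES = 3

denominators = {
    "SW-CAH": 42,
    "classical CAH": 63,
    "any CAH": 145,
}


def fold_increase(cases, n):
    """Sample rate divided by the community prevalence."""
    return float(Fraction(cases, n) / COMMUNITY_PREVALENCE)


for label, n in denominators.items():
    rate_percent = 100 * CASES / n
    print(f"{label}: {rate_percent:.1f}% of {n}, "
          f"about {fold_increase(CASES, n):.0f} times the community figure")
```

Even with the broadest denominator (all 145 women with any form of CAH), the observed rate is several hundred times the community prevalence, which is why the absolute risk can be described as modest while the relative increase is large.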

In regard to our third goal, namely the evaluation of instrument utility, almost all of the scales employed differentiated at least between the most severely affected CAH group and controls, and many picked up additional differences. For the childhood scales, four stand out as being particularly comprehensive in coverage, showing good effect sizes, and high intercorrelations: the Gender Role scale of the RCGQ-R, Part A of the MGI-F, and the two ChildSum scales of the GRAS-A. Of these four, the RCGQ-R has the lowest item number and the most homogeneous item format, without loss in sensitivity. Very good effect sizes were also shown by the two summary scales of the CPAQ and some of the subscales, but they were narrower in domain coverage. The adulthood scales tended to differentiate less well between CAH groups and controls than the childhood scales. In our sample, the Hobby Sum scale of the HPS had the largest effect size for differences between SW and control women, perhaps indicating that biologically influenced aspects of gender are more freely expressed in leisure time activities, which may be comparable to childhood play, than other domains. The intercorrelations among the adulthood summary scales suggest more specificity than is typical of the childhood scales. It is, therefore, difficult to clearly make general recommendations for adulthood scale utility on the basis of our data.

Our study shared the limitations typical of adult follow-up studies of persons with rare DSD syndromes in general and women with CAH in particular (Meyer-Bahlburg & Blizzard, 2004): relatively small sample sizes (although our study had larger CAH-subtype samples than most), high rates of nonparticipation due to a variety of factors and therefore questionable representativeness, and a cross-sectional design rather than a long-term prospective follow-up investigation with its formidable logistic barriers. Nevertheless, our findings were in line with both hypothesized expectations and pertinent findings of others so that a major distorting effect of sample attrition does not appear very plausible.

We conclude that behavioral masculinization/defeminization is pronounced in SW-CAH women, slight but still clearly demonstrable in SV women, and suggested, but questionable, in NC women. Gender dysphoria and patient-initiated gender change among women with classical CAH are rare, but more prevalent than in the general population. In the area of gender assessment, further improvements are needed.

Footnotes

1. Full statistical details on ANOVA and t-test results are available from the corresponding author on request.

 

Acknowledgments

The project described was supported in part by USPHS Grants HD-38409, RR06020 (GCRC), and by Grant Number U54 RR01-9484 from the National Center for Research Resources (NCRR), a component of the National Institutes of Health (NIH). Its contents are solely the responsibilities of the authors and do not necessarily represent the official views of NCRR or NIH. We thank the study women for their participation in this research and Drs. Sheri Berenbaum and Richard Lippa for making their respective questionnaires available to us. Ms. Rhoda Gruen served as interviewer trainer. Ms. Patricia Connolly assisted in word processing.

Copyright information

© Springer Science+Business Media, Inc. 2006