Introduction

Autobiographical narratives are how we make sense of our selves and our world (Fivush, Habermas, Waters, & Zaman, 2011; McAdams, 2001; McLean, Pasupathi, & Pals, 2007). How we choose to express our memories, both in reflection to ourselves and in conversations with others, helps shape our understanding of what these experiences are and what they mean for interpreting and evaluating who we are in the world and in relation to other people. Thus, it is especially provocative that there are reliable gender differences in autobiographical narratives (see Grysman & Hudson, 2013, for a review), such that females tell more detailed, more elaborated, more emotionally expressive, and more relationally oriented personal narratives than do males. Equally importantly, we do not see gender differences in other forms of narratives, such as fictional stories (Szaflarski et al., 2012; Weiss, Kemmler, Deisenhammer, Fleischhacker, & Delazer, 2003) or narrative descriptions of witnessed events that are not personally significant (Jack, Leov, & Zajac, 2014). These patterns suggest that the observed gender differences are not related to narrative skills per se, but are rooted in gender differences in narrating important aspects of identity.

Yet, findings of gender differences are inconsistent in the autobiographical memory literature. As Grysman and Hudson (2013) argued in their review, there are many reasons for these inconsistent findings, some theoretical and some methodological, and it is critical that research examine these possibilities more directly in order to more fully understand the extent and meaning of the gender differences in autobiographical memory. Thus, on the basis of Grysman and Hudson’s review, the goal of this study was to provide an in-depth analysis of gender differences in autobiographical memory while accounting for four possible reasons for inconsistencies in the existing literature: a focus on categorical gender rather than gender typicality, the event types elicited, how the narratives are assessed, and the populations studied. Doing so entailed a theoretically motivated analysis of autobiographical memory narratives, with gender’s role therein as its primary concern. We discuss each of these in turn.

Categorical sex/gender versus gender typicality

Grysman and Hudson (2013) hypothesized that one theoretical reason for the inconsistent findings in gendered autobiography is the focus on biological sex (obtained via self-report as male or female), rather than on identifying with characteristics that are typical of men and women. In particular, although most members of a culture can easily describe gender-typical behavior, gender identity rests on the extent to which individuals feel that they themselves are gender-typical and value this gender typicality (Tobin et al., 2010). In industrialized Western cultures, females are stereotypically described as warm, nurturing, caring, and emotional, yielding a general relationally oriented style, whereas males are described as agentic, assertive, and independent, yielding an overall autonomous orientation (Gilligan, 1982; Spence & Helmreich, 1978), and this stereotype persists today (Löckenhoff et al., 2014). Given the similarities in gender-typical behaviors and the observed dimensions of gender differences in autobiographical narratives focusing on affective and relational dimensions (e.g., Davis, 1999; McAdams et al., 2006), it is possible that individual differences in personal identification with gender-typical patterns account for the observed gender differences in autobiographical memory as much as gender as a categorical variable. We addressed this possibility in this study by explicitly assessing individual gender typicality with regard to the variables of agency and communion (henceforth referred to as gender typicality) and examined the relations among gender, gender typicality, and autobiographical narratives. That is, we examined whether personal identification with gender-typical characteristics, specifically assessing stereotypes of communion and agency (Bem, 1981; Spence, Helmreich, & Stapp, 1975), drives (i.e., mediates) the gender effects commonly reported in the autobiographical memory literature. In other words, it is possible that a greater personal value placed on communion leads a person to more processing of memories that involve others or are geared toward building interpersonal bonds (i.e., the social function of autobiographical memory; Bluck, Alea, Habermas, & Rubin, 2005), and therefore increases memory elaboration, and hence recall.

Event type in memory elicitation

Studies of autobiographical memory vary considerably in their methods of memory elicitation. For example, cue-word methods lead to comparisons of reaction times for retrieval or of numbers of memories recalled (e.g., Davis, 1999; Robinson, 1976; Ros & Latorre, 2010). Within narrative-based analyses, researchers can ask for any event that occurred within a specified time period (e.g., Grysman, Prabhakar, Anglin, & Hudson, 2013; Kanten & Teigen, 2008), which has the advantage of imposing limited bias on the memory search process, but may elicit mundane memories rather than personally significant events. Other studies explicitly solicit a specific type of event, often a highly emotional event (a high or low point: e.g., Grysman & Hudson, 2010; or a traumatic memory: e.g., Sales, Fivush, Parker, & Bahrick, 2005) or a self-defining memory (Singer & Salovey, 1993; see, e.g., Liao, Bluck, & Cheng, 2015). Thus in this study, we solicited narratives (1) through an open-ended time frame using a nonemotional cue for recall; (2) through both high- and low-point events, to explicitly elicit emotions but not explicitly prime identity; and (3) through a self-defining memory, to explicitly elicit narratives that connect events to identity.

Studies that solicit personally significant events often either solicit one type of event or combine across multiple types of personally significant events. Theoretically, we might predict greater gender differences for some types of personally significant events than others. More specifically, given that females generally express more emotion than do males across multiple contexts (Basow & Rubenfeld, 2003; Bischoping, 1993; Leaper & Ayres, 2007; Newman, Groom, Handelman, & Pennebaker, 2008), it may be that highly emotional events elicit greater gender differences in recall than do less emotional events. Similarly, events that are self-defining may be especially likely to solicit explicit gendered ways of recalling, especially for individuals who define themselves as highly gender-typical. Conversely, highly emotional events may elicit similarities across genders, whereas moderately emotional or open-ended events may elicit more spontaneous use of emotion among women than among men. Thus, with a neutral recall cue, we can examine gender differences in the spontaneous use of emotion and the connection to identity that may emerge. By explicitly analyzing a wider variety of events than in previous research, we can begin to consider whether gender differences are more apparent for some types of autobiographical events than for others, and thus better understand the extent to which event type may explain inconsistencies in the literature. Given the lack of clear comparisons across events in previous research, this analysis remains exploratory.

Assessing autobiographical memory narratives

Varied aspects of narratives have been examined in previous research, including coherence, elaborated detail, and affective expression, among others. Few studies provide an extensive scan of multiple narrative dimensions in order to more fully examine where gender differences do and do not emerge (but see Fivush, Bohanek, Zaman, & Grapin, 2012, for such a study with adolescents). Thus, in this study we assessed each of the four narratives collected on the dimensions that have been most central in the autobiographical memory literature, including coherence, elaboration, affect, agency, and connectedness (see Fivush et al., 2012, for a full theoretical discussion of these dimensions). We chose these dimensions both because they are critical to being able to tell a clear, comprehensible narrative, and because they reflect different orientations to what might be important to communicate about one’s past experiences.

Coherence

As was reviewed by Reese et al. (2011), coherence is a central component of narrative—the ability to tell a story with a comprehensible chronological structure that cohesively links one action to another. This kind of thematic coherence is a basic cognitive skill, developing across childhood, and achieving adult-like levels by late adolescence (Reese et al., 2011). Because this is a basic cognitive skill brought to bear on constructing a personal story, we did not expect to find any gender differences in this narrative dimension.

Factual and interpretive elaboration

Elaboration captures how detailed and vivid the narrative is in the telling. Mother–child reminiscence research (for a review, see Fivush, Haden, & Reese, 2006) emphasizes the developmental role of elaborations in fostering theory of mind, subjective perspective, and a differentiated sense of self in time. However, several studies have shown that females also provide more detailed and vivid memory narratives than do males (see Grysman & Hudson, 2013). Fivush et al. (2006) advocated for a more thorough understanding of elaborations as they are used in narrative. To accomplish this, we conceptualized elaboration along two orthogonal dimensions, based on Bruner’s (1990) distinction between the landscape of actions and the landscape of consciousness. Factual elaborations assessed the richness of detail about the people, actions, and physical backdrop of the event, whereas interpretative elaborations captured rich detail about one’s evaluations and interpretations of the event. In this vein, Pasupathi and Wainryb (2010) found that, with increasing age across adolescence, females tell more factual and interpretive narratives than do males (see also Grysman & Denney, 2016; Grysman & Hudson, 2011). We thus examined whether gender differences in terms of both factual and interpretive elaboration would remain in an adult sample, especially as gender differences are more commonly found in children than in adults (Buckner, 2000; Grysman & Hudson, 2013). We predicted that females would include more factual and interpretative elaborations than males. Because it is not clear whether and how this aspect of narrative might be related to gender typicality, we made no predictions.

Affect

Robust findings indicate that women generally express more affect (see Leaper & Ayres, 2007, and Newman et al., 2008, for reviews), and that they express more affect in autobiographical recall (see, e.g., Bauer, Stennes, & Haight, 2003; Boals, 2010; Buckner & Fivush, 1998; Davis, 1999; Dunn, Bretherton, & Munn, 1987; Rice & Pasupathi, 2010), than men. Moreover, emotion expression is one of the most strongly held stereotypes about gender (Löckenhoff et al., 2014). Thus, we predicted that females and those higher in feminine gender typicality would express more affect in their narratives than would males and those lower in feminine gender typicality.

Agency and connectedness

Bakan (1966) described agency and connectedness as basic gender-differentiated orientations of personality, with males displaying more agency and females displaying more connectedness (Gilligan, 1982; Spence et al., 1975). On the basis of McAdams’s life story coding system (e.g., Mansfield & McAdams, 1996; McAdams et al., 2006), agency is defined as the expression of autonomy and self-efficacy, a sense of controlling one’s life. Connectedness is a sense of relatedness and valuing others. Although connectedness with others is clearly feminine-stereotyped and agency is masculine-stereotyped (Gergen, 2001; Gilligan, 1982; Leaper & Ayres, 2007), past research has been quite mixed on whether gender differences emerged in these orientations. Thus, we were not sure whether gender differences in these themes would emerge, but we did predict that gender typicality would be related to these themes, especially because the measure of gender typicality was also structured with regard to agency and connectedness.

Participant age

A final limitation in the existing literature is the reliance on college student samples. College students, though increasingly diverse, represent a relatively thin slice of the developmental age span of adulthood. Indeed, many researchers argue that college students are part of a unique developmental group known as emerging adults (Arnett, 2000), because they display unique psychological and social identity profiles. More specifically, emerging adults, identified as ages 18–29 (Arnett, 2000), are commonly focused on individuated and professional goals (Kroger, 2003), and thus gender differences may be minimal in this group. In contrast, although variations in individual people’s life trajectories exist, young adults (defined here as ages 30–40) move into a life period defined by creating families with intimate partners and children, and the individuals in this age group commonly struggle with issues of parenting and family–work balance (Craig & Mullan, 2010; Katz-Wise, Priess, & Hyde, 2010). These pressures highlight gendered roles within the family in ways that may lead to more strongly stereotyped gender typicality and greater gender differences in autobiographical recall among this age group than among emerging adults. Thus, we chose these two age groups so as to compare two distinct developmental periods that reflect these different developmental challenges (Erikson, 1968; Kroger, 2003).

The present study

Thus, the overall objective of this study was to examine possible gender differences in autobiographical memory narratives across multiple event types, coded along multiple dimensions, across a wide developmental age span, and to examine the roles of both gender and gender typicality in these differences. Overall, we predicted that females would narrate more elaborated, affective, connected narratives than males, especially for highly emotional as compared to unspecified events, and that this difference would be greater for individuals who reported being highly feminine gender-typical, and possibly also for young as compared to emerging adults. In contrast, males and those high in masculine gender typicality would narrate more agentic narratives than either females or those low in masculine gender typicality, and this difference might be greater for young adults than for emerging adults.

Method

Participants

Participants were recruited via the Internet using Amazon’s Mechanical Turk, and were limited to the United States. We collected data from 196 participants (98 women, 98 men), ages 18–40 years. The mean reported ages were 29.05 (SD = 6.25) for women and 29.04 (SD = 6.01) for men. The participants were specifically recruited to be evenly split between emerging adults (ages 18–29, M = 23.83, SD = 3.18; 49 men, 49 women) and early adults (ages 30–39, M = 34.27, SD = 3.15; 49 men, 49 women). Their reported ethnicities were 142 White, 12 African American, 18 Asian American, 14 Hispanic/Latino, two Native American, one Middle Eastern, six biracial, and one “other.” The reported highest levels of education included 22 participants with a high school diploma, 63 with some college, 85 with a bachelor’s or associate’s degree, and 26 with an advanced degree (master’s or PhD). Finally, when asked about annual household income, 88 participants reported earning $20,000 or less; 47 reported earning $20,000–$40,000; 33 reported earning $40,000–$60,000; 11 reported earning $60,000–$80,000; and 17 reported earning $80,000 or higher. No demographic factor varied significantly by gender.

Procedure

Participants responded to a survey posted on Amazon’s Mechanical Turk, for which they were informed that they would be paid $0.55 for approximately 30 min of work, and that they might be contacted for an additional survey of similar length for $1. Participants interested in completing the second survey provided e-mail addresses and were contacted and completed the second session within one week of the first. This method of creating two sessions was chosen because research has shown that AMT users’ work quality often declines on longer surveys (see Goodman, Cryder, & Cheema, 2013). In addition, instructional manipulation checks (Oppenheimer, Meyvis, & Davidenko, 2009) were included in both phases, and data were only included in the analyses if participants answered all of these checks correctly. The first session was completed by 342 participants. Twenty-eight answered at least one attention-check item incorrectly, 16 did not provide complete data, and three self-identified as “transgender.” Of those invited back, 69 % (203/294) completed the second session; seven of these participants were excluded from the analyses for not answering attention-check items correctly on the second survey, and one more was excluded from the narrative analyses for providing narratives with questionable content, although this participant’s data were retained for the other analyses. No demographic differences were found between those who returned for a second session and those who did not. Thus, with 196 participants completing both sessions, we determined that adequate power had been achieved. Although there is no consensus on a hard and fast rule of thumb for observations per variable (e.g., age, gender, event type) when using mixed models, the proposed power heuristics have ranged from 10 (Hofmann, 1997) to 50 (Maas & Hox, 2005) observations per variable for power over .90. Our most complex model has 21 variables in it (five main effects and 16 interaction terms). With 784 observations in the data set, this represented approximately 37 observations per variable.
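The observations-per-variable figure can be reproduced with a few lines of arithmetic. The sketch below assumes that the 784 observations correspond to four narratives from each of the 196 participants, which the text implies but does not state outright; the variable names are illustrative only.

```python
# Observations-per-variable heuristic, using the counts reported in the text.
# Assumption: 784 observations = 196 participants x 4 narratives each.
participants = 196       # completed both sessions
narratives_each = 4      # open-ended, high point, low point, self-defining
model_terms = 21         # five main effects + 16 interaction terms

observations = participants * narratives_each
obs_per_term = observations / model_terms

print(observations)            # 784
print(round(obs_per_term, 1))  # 37.3
```

The result falls between the 10 and 50 observations per variable proposed by the heuristics cited above.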

The order of events in data collection was uniform for all participants. In the first session, participants completed an open-ended event (a specific event that had occurred in the past 2 years) and then a “high-point” event; in the second session, they wrote about a “low-point” event and a “self-defining memory” event (SDM). The high- and low-point event prompts were adapted from the life story interview (McAdams, 1997), and Singer and Salovey’s (1993) self-defining event prompt was used for the final narrative. Because the event prompts were considered to be progressively more explicitly self-reflective (i.e., from open-ended to high/low points to SDM), this order was maintained so that the self-focus of one narrative would not influence the way that a second narrative was reported, as was found by Grysman and Hudson (2011). Additionally, high points were always elicited at the end of the first session and low points at the beginning of the second session, so that a session did not end on the low-point event (for the sake of participants’ moods). It should be noted that the data reported in this article come from a data set on which other analyses have been performed, as reported by Grysman and Fivush (in press) and Grysman, Merrill, and Fivush (2016), but all data reported in the central analyses are unique and have not been reported elsewhere.

Personal Attributes Questionnaire, Short Form (PAQ-F and PAQ-M)

Two subscales from the PAQ, developed by Spence and Helmreich (1978), were completed by participants at the end of the second session, along with other demographic information. The feminine subscale (PAQ-F) includes eight socially desirable trait terms (emotional, devotes self, gentle, helpful, kind, understanding, aware of feelings, and warm) commonly associated with women and that broadly reflect interpersonal and expressive traits (Helmreich, Spence, & Wilhelm, 1981). The PAQ masculine subscale (PAQ-M) includes eight socially desirable trait terms (competitive, active, independent, decisive, never gives up, self-confident, feels superior, and stands up under pressure) that are commonly associated with men, including goal-oriented and instrumental traits (Helmreich et al., 1981). All scores are reported on a 1–9 scale, with higher scores indicating that the trait terms are more self-descriptive. Reliability for each scale was good: Cronbach’s α = .79 for the PAQ-F, and Cronbach’s α = .78 for the PAQ-M.
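For readers unfamiliar with the reliability statistic reported above, the following is a minimal sketch of the standard Cronbach's α formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and the ratings are hypothetical and are not drawn from the study's data.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of per-item score lists.

    Each inner list holds one item's scores, aligned by respondent.
    """
    k = len(items)
    n = len(items[0])
    # Total scale score for each respondent.
    totals = [sum(item[r] for item in items) for r in range(n)]
    item_var_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

# Hypothetical ratings: 3 items x 5 respondents (not study data).
demo_items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 5, 2, 4, 3],
]
alpha = cronbach_alpha(demo_items)
print(round(alpha, 2))
```

Values near .80, as reported for both PAQ subscales, are conventionally read as good internal consistency.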

Content coding of narrative data

Coding schemes that have been implicated in gender differences were selected from the existing literature, and adapted for these narratives as described below. All coders established reliability on 10 % of the narratives (n = 80) before coding independently. Disagreements in reliability were resolved through discussion, and reliability statistics are reported in each coding description that follows. More detailed coding manuals may be obtained from the first and third authors.

Affect

Trained coders identified each instance of an affective state or emotion word. Using a coding scheme developed by Bauer, Stennes, and Haight (2003) and by Fivush et al. (2012), the affect variable included both positive statements, such as “That was an exciting time for us” and “I felt very special,” and negative statements, such as “It was really hard on him” and “She was really sorrowful.” This did not include trait terms (e.g., “he was a cheerful person”) or repetitions that did not add emphasis, but did include the negation of positive and negative terms (e.g., “I wasn’t happy” was coded as negative). Each instance was subcoded for valence and which person in the story was feeling the affect, but positive and negative terms were summed for the final analysis rather than considered separately (see Andrews, Zaman, Merrill, Duke, & Fivush, 2015; Bauer et al., 2003). Overall Cohen’s kappa for agreement on coding specific words and phrases as indicative of affect was .85. Frequency was calculated as the total number of words and phrases indicative of affect in each narrative.
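Cohen's kappa corrects raw agreement between two coders for the agreement expected by chance. The sketch below shows the standard two-rater computation on hypothetical codes; the labels and the function name are illustrative, and the study's actual unitization of words and phrases is not reproduced here.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' categorical judgments of the same units."""
    n = len(coder_a)
    # Proportion of units on which the coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement from each coder's marginal category frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical codes for eight candidate phrases (not study data).
coder_1 = ["affect", "affect", "none", "none", "affect", "none", "none", "affect"]
coder_2 = ["affect", "none", "none", "none", "affect", "none", "affect", "affect"]

kappa = cohens_kappa(coder_1, coder_2)
print(kappa)  # 0.5
```

In this toy example the coders agree on 75% of units, but half of that agreement is expected by chance, so kappa is .50; the reported value of .85 reflects much stronger agreement.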

Connectedness

Connectedness was developed from McAdams’s communion coding scales for life stories (Mansfield & McAdams, 1996; see Andrews et al., 2015, for an adaptation). The extent to which narratives included content about meaningful relationships was scored on a scale of 0 to 3. Narratives that included no mention of meaningful interactions with others received a 0. Those that included interactions with others but did not describe the way in which the relationships were meaningful received a 1. Those that included important people in the life of the narrator, but were ultimately more event-focused than relationship-focused overall, received a 2; and those narratives that took this a step further, to describe the way in which the relationship was important for the self, received a 3. The intraclass correlation between the two reliable coders was .88.

Agency

Agency was developed from the McAdams agency coding scales for life stories (Mansfield & McAdams, 1996; see Andrews et al., 2015, for an adaptation). The extent to which the narrator expressed a sense of autonomy and empowerment in the narrative was evaluated on a scale of 0 to 3. Narrators scoring a 0 described themselves as helpless or out of control during the event. Those scoring a 1 described either accommodating to a situation that was beyond their control or providing minimal information about agency. Narrators scoring a 2 provided information about how they pursued a goal, took control of a situation, or worked hard in an event. To score a 3, the narratives needed to include information about how the agency in the story generalized to a sense of self; for example, empowerment in the moment extending to how a person saw themselves (e.g., “I realized I was capable of anything after that”). The intraclass correlation was .84.

Factual elaboration

The coding systems for factual and interpretive elaboration were based on the system for coding elaboration employed by Fivush et al. (2012) and Andrews et al. (2015), and on the distinction between factual and interpretative elaborations developed by Pasupathi and Wainryb (2010). The extent that the narrator expressed objective details of the context, including who, what, when, where, and how actions physically unfolded, was scored on a scale of 0 to 3. Narratives scoring a 0 included very few actions and were not followed up by objective details. Those scoring a 1 included multiple actions but were not followed up by much objective detail. Those scoring a 2 included many actions and how they unfolded, but not with much objective detail. The narratives scoring a 3 included many actions, how they unfolded, and a rich description of the physical aspects of where and when the event took place. The intraclass correlation was .92.

Interpretative elaboration

The extent that the narrator expressed subjective details of the context, including thoughts, emotions, beliefs, and reasoning about the event, was scored on a scale of 0 to 3. Narratives scoring a 0 included very few thoughts or feelings and were not followed up by many subjective details. Those scoring a 1 included multiple thoughts or feelings but were not followed up by many subjective details. Those scoring a 2 included many thoughts and feelings and how each was reasoned to be causally connected, but the narratives did not provide rich detail (e.g., analogies, conditional propositions, and reasoning about beliefs). Those scoring a 3 included many thoughts and feelings, how they were causally connected, and rich description of the subjective aspects of what happened and what it meant for the narrator. The intraclass correlation was .87.

Coherence

We employed the theme coherence dimension of the NaCCs coding scheme (Reese et al., 2011). The full scheme comprises three dimensions—context, chronology, and theme—each based on a scale of 0–3. As was demonstrated by Reese et al., context and chronology develop across childhood, reaching adult levels by middle adolescence, whereas theme coherence continues to show variability through adulthood. Theme coherence involves the extent to which the narrator stays on topic, provides explicit links between actions, and provides a resolution to the event or links it to other memories from the life story. Zero indicated a lack of any coherence element, whereas 3 indicated high levels of coherence across these elements. Because only theme coherence consistently shows variability in adulthood, we used only this dimension in this study. The intraclass correlation was .83.

Results

Preliminary analyses

Descriptive statistics for narrative coding are reported by event types in Table 1. Affect codes were moderately skewed, as were agency and coherence ratings in the open-ended event, but the remainder of the coding data were distributed evenly across the sample. Repeated measures analyses of variance with event type as an independent variable showed that all narrative coding systems differed by event types (ps < .001, ηp² > .05), except for factual elaboration (p = .12). Furthermore, Table 1 shows that the open-ended narrative prompt differed significantly from every other narrative prompt on all coding systems except for factual elaboration, confirming that this narrative prompt was a useful comparison group for the emotional and self-defining prompts.

Table 1 Means [with 95 % confidence intervals] for the six narrative coding systems by event type

Second, as was reported by Grysman and Fivush (in press), who examined gender typicality in relation to self-reported memory characteristics, a preliminary analysis of PAQ scores by age group and gender showed that men scored higher than women on the PAQ-M, F(1, 192) = 14.77, p < .001, ηp² = .07; conversely, women scored higher than men on the PAQ-F, F(1, 192) = 8.24, p = .005, ηp² = .04, but this effect was moderated by age group, F(1, 192) = 11.22, p = .001, ηp² = .06. Figure 1 indicates that only early-adult women scored higher than men on the PAQ-F, whereas emerging-adult men and women provided similar responses to this scale. This finding emphasizes the importance of examining age group and gender typicality as potential moderators in analyses of narrative content, because it demonstrates that personal identification with certain traits associated with gender does not show a uniform pattern across age groups.
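As a consistency check, partial eta squared can be recovered from an F statistic and its degrees of freedom through the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies that identity to the three F values reported above; the function name and dictionary labels are illustrative, and the identity is assumed to apply to these between-subjects tests.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# F values reported in the text, each with df = (1, 192).
reported_f = {
    "PAQ-M, gender": 14.77,
    "PAQ-F, gender": 8.24,
    "PAQ-F, gender x age group": 11.22,
}

effect_sizes = {
    label: round(partial_eta_squared(f, 1, 192), 2)
    for label, f in reported_f.items()
}
for label, value in effect_sizes.items():
    print(label, value)  # 0.07, 0.04, 0.06 respectively
```

The recovered values match the effect sizes reported in the text.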

Fig. 1 Scores on the PAQ-F and PAQ-M, separated by gender and age group. Error bars represent 95 % confidence intervals, and asterisks indicate significant (p < .05) group differences

Central analyses

Analyses were conducted to address four questions regarding the six narrative dimensions, each corresponding to a hypothesis. First, would categorical gender influence scores on these dimensions? Second, would effects of gender, when present, remain predictive of these narrative dimensions when accounting for age group, gender identity, and event type? Third, would effects of gender be moderated by age or gender identity? Finally, would any of the effects found be moderated by event type? A mixed-model approach allowed for all of these questions to be addressed in a series of model-fitting analyses applied to each of the six narrative dimensions. The advantages of a mixed-model approach over multivariate analysis of covariance are the ability to compare the fit of various models and the ability to test for interactions between independent variables and covariates, both between and within subjects.

For each narrative dimension, a repeated measures mixed-model analysis was conducted using SPSS. For all analyses, the between-subjects independent variables available included gender, age group, and PAQ-M and PAQ-F scores (as covariates), and a within-subjects independent variable of event type. To control for repeated measures, participant ID and intercept were entered into all analyses as random effects. Preliminary analyses indicated that including participant ID as a random effect was an appropriate choice to control for repeated measures, since tests of the covariance parameters indicated that covariance based on individual participants’ scores was significant for five of the six narrative dependent variables, ranging from Wald’s Z = 2.19, p = .03, to Wald’s Z = 6.64, p < .001, with the sixth dependent variable, agency, achieving marginal significance, Wald’s Z = 1.81, p = .07. Additionally, to avoid the measurement problems that might emerge from interacting covariates, preliminary analyses screened for interactions between the covariates and revealed none. Finally, because six models were tested, correlations between the narrative dimensions were computed to ensure that the tests were not redundant. No correlation was above r = .26, and thus the analyses proceeded without concerns for multicollinearity.

Four models were tested to correspond to the four hypotheses. The fixed effects that emerged in the first two models of the analysis are reported in Table 2, which also includes the Akaike information criterion (AIC) and Schwarz Bayesian information criterion (BIC) values for all four models. These two metrics are reported in smaller-is-better format, such that lower values indicate a better trade-off between model fit and the number of parameters used. Additionally, a difference between models of less than two points is considered small, and more than two points is considered indicative of a better balance between complexity and model fit (Seltman, 2009). When significant effects emerged, the parameter estimates with 95 % confidence intervals are reported in the text, with positive β values representing positive predictive relations, and negative β values representing negative predictive relations.
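To illustrate the smaller-is-better logic, the sketch below uses the textbook definitions AIC = −2LL + 2k and BIC = −2LL + k ln(n), where −2LL is the model's −2 log-likelihood, k the number of estimated parameters, and n the number of observations. The −2LL values and parameter counts are hypothetical, and the exact parameter and sample-size counts that SPSS uses for these criteria are not stated in the text, so this is an approximation of the comparison rather than a reproduction of it.

```python
import math

def aic(neg2_log_lik, n_params):
    """Akaike information criterion in smaller-is-better form."""
    return neg2_log_lik + 2 * n_params

def bic(neg2_log_lik, n_params, n_obs):
    """Schwarz Bayesian information criterion in smaller-is-better form."""
    return neg2_log_lik + n_params * math.log(n_obs)

# Hypothetical fits for two nested models (not values from Table 2).
n_obs = 784
simple = {"neg2ll": 2000.0, "k": 3}
richer = {"neg2ll": 1970.0, "k": 7}

aic_diff = aic(simple["neg2ll"], simple["k"]) - aic(richer["neg2ll"], richer["k"])
bic_diff = (bic(simple["neg2ll"], simple["k"], n_obs)
            - bic(richer["neg2ll"], richer["k"], n_obs))

# Positive differences above 2 points favor the richer model.
print(round(aic_diff, 2), round(bic_diff, 2))
```

Because BIC's penalty per parameter, ln(n), exceeds AIC's penalty of 2 whenever n is larger than about 7, BIC favors the simpler model more strongly in a data set of this size.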

Table 2 Mixed-model analysis, including F values for all fixed effects in the first two models and all significant interactions in Models 3 and 4

To test the first hypothesis, Model 1 included only gender as a fixed effect. As can be seen from Table 3, women’s narratives were scored higher than men’s on three narrative dimensions: connectedness, β = 0.37, 95 % CI [0.22, 0.52]; factual elaboration, β = 0.23, 95 % CI [0.04, 0.42]; and affect, β = 0.68, 95 % CI [0.17, 1.18].

Table 3 Means [with 95 % confidence intervals] for the six narrative variables, presented by gender and event type

Model 2 included gender, age group, PAQ-M, PAQ-F, and event type as fixed effects. As can be seen in Table 2, improvements in model fit are reflected by decreases in the AIC and BIC scores on all narrative dimensions except factual elaboration, reflecting consistent effects of event type, as displayed in Table 1. In other words, gender and event type predicted unique variance on these narrative dimensions. Two of the three gender effects were reduced to marginal significance but not mediated entirely. Because Model 1 accounted for repeated measures as a random effect, these reductions can be attributed to the inclusion of the between-subjects variables rather than to the inclusion of event type. PAQ-F emerged as a significant covariate on connectedness ratings, β = 0.10, 95 % CI [0.03, 0.17]. PAQ-M emerged as a significant covariate on connectedness, β = –0.06, 95 % CI [–0.12, –0.003], and agency, β = 0.08, 95 % CI [0.03, 0.12], and as a marginal covariate on affect scores, β = –0.19, 95 % CI [–0.40, 0.008]. Age group did not emerge as a significant variable for any of the analyses. In sum, Model 2 showed gender-related effects, including gender and the two gender typicality scores, on four of the narrative dimensions. Specifically, higher endorsement of feminine-typical traits was associated with narratives more commonly expressing themes of connectedness; higher endorsement of masculine-typical traits was associated with narratives more commonly expressing agency and less commonly expressing connectedness and affect. Effects of event type were common, as expected, but no effects of age group emerged.
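The link between the reported confidence intervals and the labels "significant" and "marginal" can be illustrated with a short sketch. It checks whether an interval excludes zero and backs out an approximate standard error from the interval width. The helper names are hypothetical, and the normal approximation is an assumption, since the mixed models themselves would use t-based intervals.

```python
from statistics import NormalDist


def ci_excludes_zero(lower: float, upper: float) -> bool:
    """True when the interval lies entirely above or entirely below zero."""
    return lower > 0 or upper < 0


def approx_se(lower: float, upper: float, level: float = 0.95) -> float:
    """Approximate standard error implied by a symmetric normal-theory CI."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (upper - lower) / (2 * z)


if __name__ == "__main__":
    # Intervals as reported in the text for Model 2.
    reported = {
        "PAQ-F on connectedness": (0.03, 0.17),
        "PAQ-M on connectedness": (-0.12, -0.003),
        "PAQ-M on agency": (0.03, 0.12),
        "PAQ-M on affect": (-0.40, 0.008),
    }
    for label, (lo, hi) in reported.items():
        print(label, ci_excludes_zero(lo, hi), round(approx_se(lo, hi), 3))
```

On this reading, the PAQ-M effect on affect is the only one of the four whose interval spans zero, which matches its description as marginal.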

Model 3 tested for interactions between gender and the remaining independent variables. Thus, Model 3, in addition to the five main effects of Model 2, included interactions of gender with age group, PAQ-M, PAQ-F, and event type as fixed effects. Table 2 shows that for all dependent variables other than affect, AIC and BIC scores rose substantially, suggesting that the model fit was not improved from Model 2 by including these interaction terms. Additionally, across all six variables, including affect, not one interaction term emerged as significant. Thus, no fixed-effect values from Model 3 are included in Table 2 beyond its AIC and BIC scores.

In Model 4, interactions with event type were tested, and thus the model included the five main effects of Model 2, plus the interactions of PAQ-M, PAQ-F, and age group with event type. An interaction of gender with event type was not included because it had not been significant in Model 3. As can be seen in Table 2, Model 4 improved over Model 2 only for affect, and not for any of the five remaining dependent variables. However, one significant interaction and two marginally significant interactions between PAQ-M and event type emerged in Model 4, and so they are reported in Table 2. No other significant interactions were found in Model 4. Because β values produced by mixed models for interactions represent comparisons of the effects of the covariate on one event type versus another and are not independently meaningful (Seltman, 2009), these results are presented as correlations within each event type rather than as parameter estimates. PAQ-M predicted connectedness in the high-point events, r(194) = –.29, p < .001, but not in the other event types. PAQ-M also predicted affect in the high-point events, r(194) = –.24, p = .001, and marginally predicted affect in the low-point events, r(194) = –.14, p = .059. PAQ-M scores predicted agency ratings in the self-defining event, r(194) = .20, p = .004, and marginally predicted agency ratings in the open-ended event, r(194) = .14, p = .059.
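For readers who want to relate the reported correlations to test statistics, the sketch below converts r and its degrees of freedom into a t value and an approximate 95 % confidence interval via the Fisher z transformation. The function names are hypothetical. Because the reported correlations are rounded to two decimals, recomputed values only approximate those underlying the reported p values.

```python
import math
from statistics import NormalDist


def t_from_r(r: float, df: int) -> float:
    """t statistic for a Pearson correlation with df = n - 2."""
    return r * math.sqrt(df / (1 - r ** 2))


def fisher_ci(r: float, df: int, level: float = 0.95) -> tuple[float, float]:
    """Approximate CI for r using the Fisher z transformation."""
    n = df + 2  # sample size implied by the degrees of freedom
    z = math.atanh(r)
    half_width = NormalDist().inv_cdf(0.5 + level / 2) / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


if __name__ == "__main__":
    # PAQ-M and connectedness in the high-point events, as reported.
    r, df = -0.29, 194
    print("t =", round(t_from_r(r, df), 2))
    print("95% CI =", tuple(round(x, 3) for x in fisher_ci(r, df)))
```

With df = 194 the implied sample size is 196 participants per event type, which agrees with the 784 narratives (196 × 4) analyzed in the study.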

In sum, a mixed-model analysis indicated that categorical gender predicted connectedness, affect, and factual elaboration. Concurrently, PAQ-F scores predicted connectedness, and PAQ-M scores predicted connectedness, affect, and agency, although all three effects of PAQ-M were moderated by event type. Beyond these three interactions between PAQ-M and event type, moderation effects were not present in the analysis when accounting for all possible interactions by gender and by event type.

Discussion

In this study, we systematically examined the effects of gender, gender typicality, age, event type, and narrative dimension in order to more fully explore gender differences in autobiographical memory narratives. In a data set that included 784 personal event narratives, gender and gender typicality consistently predicted memory narrative content in theoretically anticipated ways. Women’s autobiographical narratives included more factual elaboration, more affect language, and more themes of connectedness than did men’s narratives. These patterns remained significant across event types and age groups, and when accounting for gender typicality. Scales of two types of gender typicality (i.e., agency and communion) were also predictive of narrative content: Higher communion scores (PAQ-F) predicted higher connectedness in memory narratives, and higher agency scores (PAQ-M) predicted more expressions of agency and fewer expressions of affect and connectedness in memory narratives, but all three of these effects were moderated by event type.

Effects of gender and feminine gender typicality

As predicted, women’s narratives scored higher than men’s on ratings of connectedness, affect usage, and factual elaboration. Notably, although these variables are commonly associated with women’s greater interrelatedness with others (and thus with greater emotional expression and talkativeness), scores on the autobiographical memory narratives were linked to gender independent of PAQ-F scores, with PAQ-F scores also predicting connectedness ratings but not affect or factual elaboration. This pattern emphasizes the fact that multiple gender-related factors influence autobiographical recall. As children, girls are more commonly exposed to settings in which they are encouraged to talk about events and their emotions in pairs or small groups than are boys, both through maternal scaffolding (Fivush, Berlin, Sales, Mennuti-Washburn, & Cassidy, 2003) and through gender-differentiated play patterns (Rose & Rudolph, 2006). We suggest that the effects of categorical gender reflect the reality of girls’ consistent and implicit exposure to cultural norms, but we also note the presence of effects of cultural gender norms on narrative.

This interpretation is supported by the lack of any age effects on the narrative variables. The fact that the gender-based differences in the narrative variables remain constant despite changes in self-reported subscription to gender-typical traits with age suggests that the narrative variables reflect a gendered way of remembering that surpasses other potential influences on recall and gender typicality.

Furthermore, the fact that gender differences in feminine gender typicality were moderated by age warrants further research and supports the methods employed here. First, it demonstrates that “femininity” cannot be defined uniformly and assumed to be applicable to all women and men at all stages of life. Second, it emphasizes that emerging adulthood is a unique developmental stage with regard to gender, and research with undergraduates should carefully consider gender typicality or other relevant gender-related variables when conducting gender-related autobiographical memory research (see Grysman & Fivush, in press). We suggest that gender differences in typicality are minimized among emerging adults because they are focused on education and professional goals (Kroger, 2003), and that in young adulthood, a focus on family and on parenting leads to greater differentiation between men and women (Katz-Wise et al., 2010; LaChance-Grzela & Bouchard, 2010).

Additionally, we want to emphasize that our use of the word “typicality” is meant to be descriptive rather than prescriptive. We make no claim in this article about how or whether individuals should identify with respect to the construct of gender. Instead, we used an existing measure to identify specific traits and explored the relevance of these traits to autobiographical memory. Analyzing the multiple ways in which people use these traits for self-definition and self-understanding can yield a deeper appreciation of how gender influences behavior.

Exploring all of these variables and looking at the pathways through which gender affects narrative is a promising avenue to understand the multifaceted effects of gender on narratives and the nuanced ways they might be conditionally expressed. The simultaneous presence of independent effects of categorical gender and feminine gender norms (on connectedness) suggests that both growing up female and an individual’s identification with gender-typical characteristics independently and uniquely influence the way that an event is recalled and retold. Future work in this domain should consider the roles of other potential contributors to gender identity (e.g., perceived gender normativity) or of differences in upbringing (e.g., maternal reminiscing style) to deepen our understanding of the source of this gender difference and to consider whether it remains in the presence of all other potential moderators. The fact that the effects of gender and gender typicality were not moderated by event type suggests that they are fundamental to how females recall and interpret their experiences, rather than depending on the specifics of certain events or the selection of one experience in response to a particular prompt.

That women’s narratives were scored higher than men’s on factual elaboration but not interpretive elaboration was surprising, because gender effects on both of these variables were predicted. In the past, global ratings of elaboration (as opposed to Pasupathi & Wainryb’s, 2010, propositional coding) have not commonly distinguished between factual and interpretive elaboration, combining both in one rating scale on which females tend to score higher than males (e.g., Andrews et al., 2015; Fivush et al., 2012; Zaman & Fivush, 2013). Very few studies have examined specific instances of factual and interpretive detail in memory narratives. Pasupathi and Wainryb (2010) examined narrative data from children ages 5–16 and found that girls reported memory narratives with both more factual and more interpretive elaboration than boys in adolescence. However, with emerging adults, Grysman and Hudson (2011) found that women provided more factual but not more interpretive elaborations than men. Grysman and Denney (2016) compared verbally reported memory narratives to typed narratives, and found that gender differences in interpretive elaboration more commonly emerged in the verbal than in the typed narratives, though at times differences emerged in unpredicted directions. Additionally, Grysman and Denney did not find gender differences in factual elaboration, suggesting inconsistency across forms of measurement and methods of data collection. In all, then, the literature comparing factual to interpretive elaboration is mixed. Especially given the findings of Grysman and Denney, which suggest that gender differences emerge more commonly in spoken than in typed narratives, a closer evaluation of various media and of these constructs will be necessary before firm conclusions can be made about the roles of various types of elaboration in memory. Still, when gender differences are found, it is almost always the case that females are more elaborative than males. As we have demonstrated here, this gender difference holds across a wider variety of events and a wider age span than has previously been studied. However, our findings underscore the need to more carefully consider the type of elaboration, as well as the elicitation context, in future studies.

Effects of masculine gender typicality

The effect of masculine gender typicality, a measure of instrumentality or agency, on autobiographical narratives stands in stark contrast to the consistency across event types of the effects of gender and feminine gender identity. Effects of masculine gender identity emerged on three measures, but were moderated by event type in all three. Specifically, higher PAQ-M scores predicted lower connectedness and affect in high-point events and marginally lower affect in low-point events, but predicted higher agency in self-defining and open-ended events. These interactions were not predicted, but suggest that, whereas gender and feminine gender typicality effects seem to reflect a general mode of processing or recalling an event that functions independent of the event being reported, masculine gender typicality effects are more targeted. Highly emotional events for highly agentic individuals are typified by less connectedness to others and less use of affect to describe their experiences, when compared to low-agentic individuals; conversely, events that are considered self-defining, or simply open events chosen by the participant, are more open to expressions of agency. It is possible that the effects of masculine gender identity find expression via participants selecting memories relevant to agency when guided toward self-definition (or when not guided toward an emotional cue) because agency is so integrally a part of the self-definition for someone who is highly agentic, and thus scores high on this measure of masculine typicality. Previous research has revealed that highly agentic participants were more likely to choose events that exemplified their agency, in studies using open-ended prompts (Markus, 1977; McAdams, 1982; Woike, Gershkovich, Piorkowski, & Polo, 1999).

What is perhaps most interesting in our data is that, for females, there seems to be a more consistent narrative style that conforms to feminine gender typicality across event types, whereas for males, gender and gender typicality are expressed more in some contexts than in others. This pattern calls for replication and extension, as it has important implications for how we understand gendered autobiographical memory.

Narrative coherence

Notable among our findings is the lack of effects of gender or gender typicality on the thematic coherence of narratives. As we argued in the introduction, coherence is a basic cognitive linguistic skill that shows a clear developmental trajectory and allows individuals to construct structurally comprehensible accounts of their experiences. Thus, we neither expected gender differences nor observed them in this data set. We emphasize that this is critically important, as interpretations of gender differences in autobiographical recall rely as much on obtaining predicted differences as on not obtaining differences that are not theoretically expected.

Event type

Narrative coding revealed differences between event types on five of the six narrative dimensions. The most common pattern, observed for connectedness, affect, interpretive elaboration, and coherence, was a steady increase from the open-ended prompt to the self-defining event, with high-point and low-point narratives not differing. These differences justify the choice to collect data in the order designed. This pattern signifies the importance of carefully selecting the memory elicitations used in a research design, as they elicit varied narrative characteristics. Grysman and Hudson (2013) suggested that event type may confound research on gender because eliciting emotion-laden narratives among men may elide gender differences more than other narrative elicitations. The findings of this article show that gender effects remain even in emotional events. However, effects of masculine typicality did vary by event type, further confirming the value of varying the event type elicited and of carefully considering narrative elicitation in future research.

Summary and conclusions

The pattern of findings obtained here points to the complexity of the roles of both categorical gender and gender typicality in autobiographical memory. By using a wider variety of event prompts, narrative dimensions, and age spans than previous literature, this study began to dissect the complexity of gendered autobiography. Importantly, both female gender and feminine gender typicality played a role in how elaborative, affectively laden, and connected autobiographical narratives were, and these effects held across event types. In contrast, masculine gender typicality was related to the expression of agency, connectedness, and affect for some events but not for others. This pattern suggests that females have a more consistent autobiographical narrative style than do males. As Fivush and Zaman (2014) argued, autobiographical reminiscing is, itself, a feminine-typed activity; females engage in this activity more frequently, value the activity more, and use it to create intimacy more than males. Thus, simply growing up in a gendered world and being female creates an environment in which these skills develop more for females than for males. Yet feminine gender typicality plays a role as well, suggesting that both implicit gendered socialization and more explicit self-definition in gendered ways each contribute to creating more gendered autobiographies. For males, in contrast, the effects are more contextually constrained, suggesting that this does not represent a more gendered style of autobiographical narration, but rather that specific prompts pull for a more gendered expression. Thus, autobiographical reminiscing per se does not pull for male gender, but specific types of recalled events do. Clearly, these results both clarify some of the discrepant findings in the literature and raise additional questions. In particular, the pathways and conditions through which gender affects narrative expression are variegated. The inclusion of gender typicality, event type, and age factors in more complex models may elucidate important findings and yield a deeper understanding of the mechanisms involved in gendered autobiographical memory.