Introduction

In recent decades, South Korea (hereafter, Korea) has received considerable attention for its low period total fertility rates (Jones et al., 2009; Yoon, 2022). This trend mirrors that of other East Asian societies, including Hong Kong, Japan, Singapore, Taiwan and China, which have also faced ‘lowest-low’ or ‘ultra-low’ fertility (Jones et al., 2009; Bumpass et al., 2009; Morgan et al., 2009). Nonetheless, Korea underwent a relatively more rapid transition from 6.1 children per woman in 1960 to replacement level in 1983, followed by a precipitous decline to 1.3 in 2001 (World Bank, 2022). Since then, Korea’s fertility rate has been on a consistent downward trend, reaching a low of 0.8 in 2021 (Korean Statistical Information Service, 2023).

Macro-level analyses mainly attribute East Asia’s ultra-low fertility rates to shifts in marriage and childbearing patterns (Frejka et al., 2010; Jones, 2007). The notable rise in women’s average ages at first marriage and first childbirth has led to a marked ‘postponement transition’ since the 1980s (Kohler et al., 2002; Yoo, 2016). In Korea, between the early 1990s and 2022, women’s average age at first marriage increased from 24.7 to 31.3 years, and the age at first childbirth rose from 26.2 to 33.5 (Korean Statistical Information Service, 2023). In contrast to the Western upward trend in non-marital childbearing (Perelli-Harris et al., 2012; Coleman, 2013; Thomson, 2014), these patterns of delayed family formation and the close link between marriage and childbearing are likely to persist, coupled with a growing trend towards singlehood, particularly among younger cohorts (Lee et al., 2021; Jones & Gubhaju, 2009; Rindfuss & Choe, 2015; for a review, see Raymo et al., 2015). Thus, growing interest in understanding the social underpinnings of family behaviour and formation patterns characterises the Korean context.

Previous research has highlighted many personal factors potentially contributing to the decline in Korea’s fertility rates, ranging from education expenses and economic downturns to access to the labour market and the dynamics of gender attitudes, norms and roles (Anderson & Kohler, 2013; Chang et al., 2023; Kim, 2017; Lee, 2019; Lee et al., 2021; Park, 2013; Yoo, 2014; Tan, 2022; Tan et al., 2016). A growing body of research has highlighted the importance of social factors in shaping reproductive choices. Studies have suggested the deep roots of reproductive decisions in the family lineage, with childhood family socioeconomic status and upbringing relating most closely to marriage and childbirth decisions (Barber, 2000; Kolk, 2014; Rotering, 2017; Park, 2013; Tan, 2022). In the Korean context, such cultural factors as the responsibilities of the eldest daughter and social obligations that differ between rural and urban women further elucidate these complex social processes (Tan, 2022; Rindfuss & Hirschman, 1984).

However, the understanding of Korea’s fertility patterns has been limited by the reliance on two empirical approaches. On the macro level, population-level analysis focuses mainly on producing evidence of population trends, computing changes in aggregate-level components and relating them to the country’s fertility rate (Jones, 2007; Yoo, 2014). While formal demographic techniques, such as decomposition and standardisation, provide important information on the relative contribution of aggregate factors to Korea’s fertility rate (Yoo, 2014; Yoo & Sobotka, 2018), they are less suitable for exploring the complex associations between one’s life experiences and fertility. On the micro level, survey-based approaches offer a more detailed exploration of the link between individual social dynamics and family-related outcomes (Kim, 2017). Methods like event-based/multivariate regression could complement macro-level analysis by integrating various socioeconomic factors into an understanding of fertility variations. Nevertheless, their scope often stops at one specific transition at a particular point in time, offering a less flexible means of holistically capturing the unfolding of diverse family formation pathways over the longer life course.

Recent discussions have highlighted the merits of using sequence analysis to comprehensively map family trajectories, treating various sequences in relation to marriage and parenthood as an integrated whole (Abbott & Tsay, 2000; Billari, 2005). Particularly when combined with regression-based techniques, this alternative approach offers a more flexible means of estimating the complex interplay between one’s early socioeconomic position or family background and subsequent family formation patterns (Elder, 1994; Huinink & Kohli, 2014). It enables a deeper understanding of the timing, sequencing and quantum of multiple family formation events and the broader implications of childhood factors for fertility. Adopting a life course perspective, this integrative method may provide a more holistic understanding of fertility variations and of the turning points that can redirect family pathways.

In this technical research note, we aim to illustrate the effectiveness of sequence analysis as a method for comprehensively modelling family formation trajectories, their diversity and the influence of early-life factors on the diverging pathways individuals may take. This more inductive approach to family trajectories allows us to better estimate the complex processes and variations within family formation, serving as a valuable complement to the macro-level analysis that predominates in studies of Korea’s fertility patterns. Findings from this study also resonate with the growing understanding of life course influences on family trajectories, a perspective applied predominantly in Western-based research (Balbo et al., 2013; Buhr & Huinink, 2014). We extend this approach to a less-studied East Asian context, which features considerably different cultures and patterns of marriage and fertility (Du, 2023; Lim, 2021; Park, 2021; Xu et al., 2020; Yi & Cai, 2023).

Data

We used data from the Korean Longitudinal Survey of Women and Families (KLoWF), a nationally representative panel survey of Korean women that collects information on their marital status, fertility behaviour, family background and other sociodemographic characteristics. A total of 14,699 women were surveyed between 2007 and 2020. Interviews took place in 2007 (wave 1), 2008 (wave 2) and 2010 (wave 3), and biennially thereafter until 2020 (wave 8). We combined women’s birth history and individual-level data to capture each respondent’s fertility and marital information.

We selected women born between 1940 and 1979, splitting them into four 10-year cohorts: 1940–1949, 1950–1959, 1960–1969 and 1970–1979. We note that the 1970–1979 cohort, who were 41–50 years old in 2020, may not have fully completed their fertility. While the data may underestimate the number of births occurring after age 41 for this group, the overall impact of potential births is likely unsubstantial because only approximately 5% of births in 2021 were to women aged 40 or over (Korean Statistical Information Service, 2023). Excluding women born during or after 1980 reduced the sample to 10,789 women.

We used information on each child’s birth year and the year in which marital status changed to construct the women’s marital and birth histories over their reproductive life course from age 13 to 45. We coded the number of children as a count variable ranging from 0 to 9. Marital status comprised four categories: never married, married, separated or divorced, and widowed. Combining the two variables yielded eight mutually exclusive family formation statuses: (1) never married and childless, (2) never married with one or more children, (3) married and childless, (4) married with one child, (5) married with two children, (6) married with three children, (7) married with four or more children and (8) separated, divorced or widowed with zero or more children.
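The status coding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the function names, the string labels for marital status, the year-level resolution (age taken as calendar year minus birth year) and the simplification that a marriage ends at most once and is not followed by remarriage are all hypothetical.

```python
from typing import List, Optional


def family_state(marital: str, n_children: int) -> int:
    """Map marital status and number of children to one of the eight statuses."""
    if marital in ("separated_divorced", "widowed"):
        return 8  # status 8 ignores the number of children
    if marital == "never_married":
        return 1 if n_children == 0 else 2
    if marital == "married":
        # 0 children -> 3, 1 -> 4, 2 -> 5, 3 -> 6, 4 or more -> 7
        return 3 + min(n_children, 4)
    raise ValueError(f"unknown marital status: {marital}")


def build_sequence(
    birth_year: int,
    marriage_year: Optional[int],
    child_birth_years: List[int],
    dissolution_year: Optional[int] = None,
    start_age: int = 13,
    end_age: int = 45,
) -> List[int]:
    """Return one status per year of age from start_age to end_age (inclusive).

    Assumption: remarriage is not modelled, and all dissolutions are
    labelled 'separated_divorced' because both map to status 8 anyway.
    """
    sequence = []
    for age in range(start_age, end_age + 1):
        year = birth_year + age
        if dissolution_year is not None and year >= dissolution_year:
            marital = "separated_divorced"
        elif marriage_year is not None and year >= marriage_year:
            marital = "married"
        else:
            marital = "never_married"
        n_children = sum(1 for y in child_birth_years if y <= year)
        sequence.append(family_state(marital, n_children))
    return sequence


# Hypothetical respondent: born 1960, married 1985, births in 1987 and 1990.
example = build_sequence(1960, 1985, [1987, 1990])
```

Each respondent then contributes a sequence of 33 yearly statuses, which is the input format that the distance and clustering steps require.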

Next, we introduced a range of predictors to examine the association between women’s childhood background and subsequent family formation trajectories. These included their birthplace, whether they grew up in a two-parent family, parental occupation and education, number of siblings and birth order. We coded birthplace as a binary variable: ‘metropolitan area’ (Busan, Daegu, Daejeon, Gwangju, Incheon, Seoul and Ulsan) or ‘non-metropolitan area’ (the other nine provinces). We used a binary variable to indicate whether the respondent lived with both mother and father at age 15. We categorised parental occupation (i.e. mother’s and father’s respective jobs when the respondent was 15 years old) according to the International Standard Classification of Occupations (ISCO-08), using three categories: skill level 3 or 4 (advanced and specialised skills), skill level 2 (intermediate skills) and skill level 1 or other (basic skills). Parental education was coded as a binary variable (i.e. at least high school education, or not). Number of siblings is a continuous variable, ranging from 0 to 15, as is birth order, ranging from 1 to 12.
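As a rough illustration of how such covariates might be recoded, the sketch below maps birthplace to the metropolitan indicator and ISCO-08 major groups to the three skill categories. The function names and input formats are hypothetical. The grouping of major groups 1 to 3 as skill level 3 or 4, groups 4 to 8 as skill level 2 and group 9 as skill level 1 follows the standard ISCO-08 skill-level mapping, while placing armed forces occupations and missing values in the ‘other’ category is an assumption rather than something the survey documentation confirms.

```python
from typing import Optional

# Metropolitan cities listed in the text; everything else is non-metropolitan.
METROPOLITAN = {"Busan", "Daegu", "Daejeon", "Gwangju", "Incheon", "Seoul", "Ulsan"}


def code_birthplace(place: str) -> int:
    """Return 1 for a metropolitan birthplace and 0 otherwise."""
    return 1 if place in METROPOLITAN else 0


def code_skill_level(isco_major_group: Optional[int]) -> str:
    """Collapse an ISCO-08 major group (0-9) into three skill categories.

    Assumption: armed forces (group 0) and missing occupations fall
    into the residual 'skill level 1 or other' category.
    """
    if isco_major_group in (1, 2, 3):
        return "skill level 3 or 4"
    if isco_major_group in (4, 5, 6, 7, 8):
        return "skill level 2"
    return "skill level 1 or other"


def code_two_parent(lived_with_mother: bool, lived_with_father: bool) -> int:
    """Return 1 if the respondent lived with both parents at age 15."""
    return 1 if (lived_with_mother and lived_with_father) else 0
```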

Methods

Previous studies have often treated marital and birth events as single, static family transitions, an approach now recognised as inadequate for capturing the dynamic nature of women’s family formation pathways (Billari et al., 2006; Billari, 2015). Within a life course framework, sequence analysis has emerged as a powerful tool for capturing the timing, sequencing and frequency of events, thus offering a flexible yet holistic view of an individual’s life trajectory (Abbott, 1995). This method addresses the limitations of event-history analysis by considering the interconnectedness of life events, rather than viewing them in isolation (Aisenbrey & Fasang, 2010). Recent applications of sequence analysis have mapped different life events to reveal diverse pathways of cohabitation experiences (Di Giulio et al., 2019), teenage parenthood (Kalucza et al., 2020), migrant family formation (Kraus, 2019), family trajectories among lesbian-gay-bisexual (LGB) groups (Ophir et al., 2023) and late-employment couples with family formation in adulthood (Wahrendorf et al., 2017). Specifically, through trajectory clustering, this method reduces the complexity of diverse life courses by summarising them into simplified typologies while retaining insights into the duration and nature of each phase within these trajectories.

Here, we applied sequence analysis to identify family formation trajectories of Korean women. After aligning their yearly family formation status from ages 13 to 45, we provided a descriptive overview of family formation trends over time by generating chronograms that summarised the marital and fertility sequences for each birth cohort. Following the cohort description, we performed optimal matching analysis (OMA), using the Needleman-Wunsch algorithm to compute a dissimilarity matrix that measures the distances between sequences on the basis of the minimum substitution, insertion and deletion costs of transforming one sequence into another (Needleman & Wunsch, 1970). This approach takes into account both similarities and differences between the sequences (Abbott, 1995; Abbott & Tsay, 2000). In this process, matches between corresponding elements in the sequences typically receive a positive score, while mismatches and gaps incur penalties, often represented by negative scores; the fewer operations required, the greater the similarity between the two sequences (Pearson & Miller, 1992). We set equal costs for all operations because, in our case, theoretically or heuristically derived costs may be arbitrary. We obtained similar results using dynamic Hamming distances and transition-based substitution costs in our sensitivity analyses.
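To make the optimal matching step concrete, the following Python sketch computes the minimum total cost of substitutions, insertions and deletions needed to turn one status sequence into another, using a standard dynamic programming recursion. It uses the cost-minimising formulation rather than a score-maximising one, and the unit costs are an assumption standing in for the equal costs mentioned above. The function names are hypothetical, and the sketch is not the Stata routine used in the analysis.

```python
from typing import List, Sequence


def om_distance(
    a: Sequence[int],
    b: Sequence[int],
    sub_cost: float = 1.0,
    indel_cost: float = 1.0,
) -> float:
    """Minimum cost of transforming sequence a into sequence b."""
    n, m = len(a), len(b)
    # prev[j] holds the cost of aligning the first i-1 elements of a
    # with the first j elements of b.
    prev = [j * indel_cost for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [i * indel_cost] + [0.0] * m
        for j in range(1, m + 1):
            substitute = prev[j - 1] + (0.0 if a[i - 1] == b[j - 1] else sub_cost)
            delete = prev[j] + indel_cost
            insert = curr[j - 1] + indel_cost
            curr[j] = min(substitute, delete, insert)
        prev = curr
    return prev[m]


def dissimilarity_matrix(
    sequences: List[Sequence[int]],
    sub_cost: float = 1.0,
    indel_cost: float = 1.0,
) -> List[List[float]]:
    """Symmetric matrix of pairwise optimal matching distances."""
    n = len(sequences)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = om_distance(sequences[i], sequences[j], sub_cost, indel_cost)
            matrix[i][j] = matrix[j][i] = d
    return matrix
```

With roughly 10,000 sequences of length 33, the pairwise loop involves tens of millions of comparisons, so a production analysis would normally rely on an optimised implementation and on collapsing identical sequences first.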

Following the OMA results, we used Ward’s hierarchical clustering method to classify similar sequences into groups. This method aggregates similar sequences based on the dissimilarity matrix derived from the OMA. To determine the optimal number of clusters, we compared different cluster-fit solutions by assessing Calinski and Harabasz’s pseudo-F statistic, Duda and Hart’s indices and the cluster dendrogram (see Table A1 and Figure A1 in the Appendix). These statistical measures help to assess the quality of clustering solutions and identify the number of clusters that best fits the data. The dendrogram visually organises clusters hierarchically, with individual data points at the base and clear separations between clusters in its branches. We used chronograms to illustrate the marital and fertility sequences for different clusters. Descriptive statistics for each cluster are provided in Table 1.
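For readers unfamiliar with the pseudo-F statistic, the sketch below shows one way to compute a Calinski and Harabasz style ratio of between-cluster to within-cluster dispersion directly from a dissimilarity matrix and a vector of cluster labels. It assumes that sums of squares can be recovered from squared pairwise distances, which holds exactly for Euclidean distances and is only an approximation for optimal matching distances. The function name is hypothetical, and statistical packages may define the index slightly differently.

```python
from typing import Dict, List, Sequence


def pseudo_f(dist: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Calinski-Harabasz style pseudo-F from pairwise distances.

    Assumption: the sum of squares of a group equals the sum of its
    squared pairwise distances divided by the group size.
    """
    n = len(labels)
    clusters: Dict[int, List[int]] = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need at least 2 clusters and fewer clusters than cases")

    def sum_of_squares(members: List[int]) -> float:
        total = 0.0
        for pos, i in enumerate(members):
            for j in members[pos + 1:]:
                total += dist[i][j] ** 2
        return total / len(members)

    total_ss = sum_of_squares(list(range(n)))
    within_ss = sum(sum_of_squares(members) for members in clusters.values())
    between_ss = total_ss - within_ss
    if within_ss == 0.0:
        return float("inf")  # clusters are perfectly compact
    return (between_ss / (k - 1)) / (within_ss / (n - k))


# Toy example: four cases located at 0, 1, 10 and 11 on a line.
points = [0.0, 1.0, 10.0, 11.0]
toy_dist = [[abs(p - q) for q in points] for p in points]
well_separated = pseudo_f(toy_dist, [0, 0, 1, 1])
poorly_separated = pseudo_f(toy_dist, [0, 1, 0, 1])
```

In the toy example, the partition that groups neighbouring cases together yields a far larger ratio than the partition that mixes them, which is the comparison the index is designed to support when candidate solutions are evaluated side by side.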

To examine the association between childhood background and family formation trajectories, we conducted a multinomial logistic regression, using the covariates to predict cluster membership. The results are presented using relative risk ratios (RRR). We carried out all analyses in Stata 17, using the SADI (Halpin, 2017) and SQ (Brzinsky-Fay et al., 2006) packages for sequence analysis.
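The link between multinomial logit coefficients and relative risk ratios can be illustrated with a short Python sketch. In this model, each non-reference cluster has its own coefficient vector, the reference cluster’s coefficients are fixed at zero, and the RRR for a covariate is the exponentiated coefficient. The function names, the intercept of 1.0 and the single-covariate setup are hypothetical and chosen only for illustration; the slope is set so that its RRR equals 0.7, a value in the range reported in the results.

```python
import math
from typing import Dict, List


def relative_risk_ratio(coefficient: float) -> float:
    """RRR for a one-unit increase in a covariate."""
    return math.exp(coefficient)


def mlogit_probabilities(
    betas: Dict[str, List[float]],
    x: List[float],
    reference: str,
) -> Dict[str, float]:
    """Predicted probabilities from a multinomial logit model.

    betas maps each non-reference outcome to [intercept, slope_1, ...];
    the reference outcome has all coefficients fixed at zero.
    """
    scores = {reference: 0.0}
    for outcome, b in betas.items():
        scores[outcome] = b[0] + sum(bi * xi for bi, xi in zip(b[1:], x))
    denominator = sum(math.exp(s) for s in scores.values())
    return {o: math.exp(s) / denominator for o, s in scores.items()}


# Hypothetical coefficients: intercept 1.0, slope log(0.7) for a binary covariate.
betas = {"late_marriage_one_child": [1.0, math.log(0.7)]}
ref = "non_marriage_childless"
p0 = mlogit_probabilities(betas, [0.0], ref)  # covariate = 0
p1 = mlogit_probabilities(betas, [1.0], ref)  # covariate = 1
risk0 = p0["late_marriage_one_child"] / p0[ref]
risk1 = p1["late_marriage_one_child"] / p1[ref]
rrr = risk1 / risk0  # equals exp(slope)
```

An RRR below 1 therefore means that the covariate lowers the probability of the given cluster relative to the reference cluster, not necessarily the absolute probability of that cluster.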

Results

Overall, Korean women are having later and fewer marriages and births across cohorts (Fig. 1). The mean age of first marriage increased from 22.4 for the 1940–1949 cohort to 26.3 years for the 1970–1979 cohort. Similarly, the mean age of first birth increased from 23 to 27 years, with the mean number of children declining from 3 to 1.7. The cohort patterns suggest that women born in the same period may share similar sociohistorical experiences, prospects and norms, commensurate with their family formation pathways. For example, younger cohorts born during and after Korea’s rapid economic development in the 1960s appear to shift toward a delayed and declining marriage and fertility trajectory.

Fig. 1 Cohort progression for family formation trajectories
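A chronogram of this kind plots, for each age, the share of women in each family formation status. The Python sketch below shows how those shares could be tabulated from a set of equal-length status sequences; the function name and the toy sequences are hypothetical, and plotting is left out.

```python
from collections import Counter
from typing import List, Sequence


def state_distribution(
    sequences: List[Sequence[int]],
    n_states: int = 8,
) -> List[List[float]]:
    """Share of cases in each status (1..n_states) at each sequence position."""
    n_cases = len(sequences)
    length = len(sequences[0])
    distribution = []
    for t in range(length):
        counts = Counter(seq[t] for seq in sequences)
        distribution.append(
            [counts.get(state, 0) / n_cases for state in range(1, n_states + 1)]
        )
    return distribution


# Two toy sequences of three years each (statuses coded 1 to 8).
toy = [[1, 1, 3], [1, 3, 4]]
shares = state_distribution(toy)
```

Running the tabulation separately for each birth cohort, or for each cluster, gives the inputs for stacked area plots of the type shown in the figures.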

Next, we used cluster analysis to group the trajectories into clusters. We selected the six-cluster solution based on the fit statistics (Fig. 2). Duda and Hart’s pseudo-T² for six clusters (1.05) was relatively low, indicating favourable compactness and separation of clusters compared to neighbouring cluster solutions (Appendix Table A1). Similarly, Calinski and Harabasz’s pseudo-F statistic (3,409.79) for six clusters showed notable improvement compared to the five-cluster solution (3,519.64). The dendrogram visually illustrated six distinct branches or levels of separation, which supports the presence of six clusters within the data (Appendix Figure A1). In addition, each cluster has a substantive interpretation in terms of the timing, progression and number of events that clearly differentiate the diverse family formation pathways over time.

A gradient is evident across the six clusters: (1) ‘Non-marriage–childless (5%)’, (2) ‘Late marriage–one child (16%)’, (3) ‘Late marriage–two children (28%)’, (4) ‘Early marriage–two children (25%)’, (5) ‘Early marriage–three children (19%)’ and (6) ‘Early marriage–more children (7%)’.

Fig. 2 Family formation trajectories by identified clusters

Descriptively, as we progress from the ‘Non-marriage–childless’ cluster to the ‘Early marriage–more children’ cluster, the mean age at first marriage and the mean age at first birth decrease, while the mean number of siblings and the mean birth order increase (Table 1). Women from older birth cohorts appeared likely to be categorised into clusters associated with earlier marriage and childbearing, as indicated by a moderate relationship between women’s birth cohort and their cluster assignment (Cramér’s V = 0.31). Moreover, women who were born in non-metropolitan areas, with less educated parents, and whose parents held occupations requiring basic and elementary skills (skill level 1) appear to report earlier marriage and childbearing patterns.
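Cramér’s V summarises the strength of association in a contingency table, here birth cohort by cluster, on a scale from 0 to 1. A minimal Python sketch of the calculation follows; the function name and the toy tables are hypothetical and do not reproduce the study’s cross-tabulation.

```python
import math
from typing import List, Sequence


def cramers_v(table: Sequence[Sequence[float]]) -> float:
    """Cramér's V for an r x c table of counts.

    Assumption: no row or column of the table is entirely zero.
    """
    row_totals: List[float] = [sum(row) for row in table]
    col_totals: List[float] = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals))
    return math.sqrt(chi2 / (n * (k - 1)))


# Toy tables: perfect association and no association.
perfect = cramers_v([[10, 0], [0, 10]])
none = cramers_v([[5, 5], [5, 5]])
```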

Table 1 Descriptive statistics by cluster membership

Beyond observing and describing trends in social and demographic factors related to family formation, our analysis investigates the statistical relationships among these factors. To examine the association between women’s family background and cluster membership, we estimated a multinomial logistic regression predicting cluster membership (Table 2). Compared with younger cohorts, women in older cohorts were more likely to experience earlier marriages and have more children. Compared to women born in non-metropolitan areas, those born in metropolitan areas had a lower relative risk of following the ‘Late marriage–one child’ (RRR = 0.7, p < 0.001), ‘Late marriage–two children’ (RRR = 0.62, p < 0.001), ‘Early marriage–two children’ (RRR = 0.46, p < 0.001), ‘Early marriage–three children’ (RRR = 0.41, p < 0.001) and ‘Early marriage–more children’ (RRR = 0.25, p < 0.001) clusters than the ‘Non-marriage–childless’ cluster. Relative to those who grew up with two parents, women who grew up with one or no parent had a higher relative risk of being in the ‘Late marriage–two children’ (RRR = 1.55, p = 0.03), ‘Early marriage–two children’ (RRR = 1.69, p = 0.01), ‘Early marriage–three children’ (RRR = 2.03, p = 0.001) and ‘Early marriage–more children’ (RRR = 2.9, p < 0.001) clusters than in the ‘Non-marriage–childless’ cluster.

Table 2 Multinomial logistic regression predicting cluster membership

Compared to women whose father did not attain a high school education or above, those whose father had at least a high school education had a lower relative risk of being in the ‘Early marriage–two children’ (RRR = 0.73, p = 0.02), ‘Early marriage–three children’ (RRR = 0.74, p = 0.03) and ‘Early marriage–more children’ (RRR = 0.46, p < 0.001) clusters than in the ‘Non-marriage–childless’ cluster. Women with relatively more siblings were more likely to follow pathways characterised by marriage and having children than those who had no siblings.

Conclusions

Using a life course framework, this study demonstrated the utility of sequence analysis to more holistically unpack the complex family formation trajectories of Korean women and the association between their childhood background and subsequent family pathways. Across the 1940s-to-1970s cohorts, the results show a gradual shift toward later and fewer marriages and births. We identified six clusters of family formation trajectories. In our analytic sample, the three clusters with the highest group prevalence were the ‘Late marriage–two children (28%)’ cluster, the ‘Early marriage–two children (25%)’ cluster and the ‘Early marriage–three children (19%)’ cluster, implying a predominance of the two-child norm among women born between 1940 and 1979. However, combining the initial cohort results with recent evidence suggesting a continuing decline in first and second births in Korea (Yoo & Sobotka, 2018), the prevalence of the ‘Non-marriage–childless’ and ‘Late marriage–one child’ clusters (Clusters 1 and 2) may eventually increase with younger cohorts. Future research should continue to monitor the change in timing, sequencing and quantum of family formation for younger cohorts when the data become available.

Notably, we observed variations in women’s subsequent family formation trajectories by their childhood socioeconomic background and other cultural factors. Consistent with findings in Western (Schoen et al., 2009; Sironi et al., 2015) and other East Asian (Tan, 2023) societies, growing up with parents of higher socioeconomic status and in a smaller family relates to pathways with delayed and declining marriage and fertility. Individuals’ birthplace also matters. Women born in metropolitan areas were more likely to experience delayed family formation and lower fertility levels than their counterparts born in non-metropolitan areas. This corroborates previous research showing similar variations in fertility between urban and rural areas, attributable in part to cultural norms in different environments (Kulu, 2013). Considering that Korea has the fourth-largest urban population among the OECD countries, regional demographic disparities may intensify as the urban population continues to grow (Organisation for Economic Cooperation and Development, 2021). Because places differ in the costs of and incentives for having children, more effort could be made to help women achieve their desired number of children at different life stages, particularly where the costs of raising children are increasing. While the results have only scratched the surface of the complex link between intergenerational relationships and family formation pathways, they support the idea of linking parents’ lives to their children’s family pathways over the later life course.

While this study demonstrates the utility of integrating life course and sequence frameworks to understand the dynamic nature of marriage and fertility patterns, it does have notable limitations that may offer avenues for future work. First, our empirical approach was unable to account for the changing nature of family experiences and their impact on family formation pathways. We measured only women’s childhood family structure and parental occupation at age 15. While these measures commonly function as a proxy for an individual’s childhood background, especially in stratification research (see, for example, Bloome, 2017), they do not capture the time-varying effects of parental socioeconomic status or family transitions over the earlier life course. Second, further studies on related life course changes can illuminate the influence of individual characteristics or other adulthood transitions on family formation pathways. Specifically, follow-up research could examine structures of opportunities and constraints in greater depth. For example, better contextualising the variations in family formation behaviour requires understanding housing, education or employment structures in different areas. Third, longitudinal weights were not applied in our analysis because the baseline survey provided most of the data on women’s marital and birth histories. We therefore acknowledge potential biases related to mortality selection among older cohorts and incomplete family formation among younger cohorts. However, applying selective weights in these cases might result in disproportionate weighting. Future research could focus on specific cohorts to gain deeper insights into demographic dynamics within each group. Finally, the data we used captured only the experiences of women. Further exploration of couples’ experiences is crucial for a more complete understanding of family formation in Korea.

Despite these limitations, the study contributes to the existing literature by implementing an alternative empirical framework to more precisely estimate Korean women’s family formation trajectories and examine the implications of initial sociohistorical contexts for eventual pathways. Instead of analysing marital or fertility events in aggregate terms or at specific points in time, sequence analysis allows us to perceive continuity in marriage and childbearing processes. By grouping the diverse trajectories into meaningful clusters, our typology serves as an entry point to understanding the association between childhood factors and subsequent pathways. The overall results highlight the conceptual importance of historical time and place (i.e. cohort and geographic location) and linked lives (i.e. parental resources and family structure) to the development of family formation events. Although the life course theory increasingly appears in efforts to understand fertility in Western cultures, the application of this concept in East Asia appears less frequently. The preliminary findings of this study support the idea that considering life course circumstances and context could enhance our understanding of trends and patterns in marriage and fertility in Korea. Further life course research on fertility may unearth clues for programmes that aim to improve the reproductive agency of women and ameliorate social inequalities in family formation.

Appendix A

Table A1 Calinski-Harabasz and Duda-Hart indices for different cluster solutions
Fig. A1 Dendrogram. Note: The dendrogram plots the distance level at which the mergers of clusters occur