Intergenerational mobility measurement with latent transition matrices

We propose a multivariate approach for the estimation of intergenerational transition matrices. Our methodology is grounded on the assumption that individuals’ social status is unobservable and must be estimated. In this framework, parents and offspring are clustered on the basis of the observed levels of income and occupational categories, thus avoiding any discretionary rule in the definition of class boundaries. The resulting transition matrix is a function of the posterior probabilities of parents and young adults of belonging to each class. Estimation is carried out via maximum likelihood by means of an expectation-maximization algorithm. We illustrate the proposed method using National Longitudinal Survey Data from the United States in the period 1978-2006.


Introduction
As income is often conceived as the main indicator of social status, economic studies on intergenerational mobility mostly focus on income or earnings mobility (Black and Devereux 2011; Björklund and Jäntti 2020). When dealing with the measurement of intergenerational mobility, a number of different methods are available to the researcher, even though there is no general agreement in the literature about the best way of measuring this phenomenon (Jäntti and Jenkins 2013). Intergenerational correlation and elasticity have been the most popular measures since the seminal papers by Solon (1992) and Zimmerman (1992). A limitation of these methods is that they are not informative about nonlinearities in mobility. Different methods have been adopted to overcome this limitation, including the Spearman (or rank) correlation (Chetty et al. 2014; Corak et al. 2014), transition matrices (Corak and Heisz 1999; O'Neill et al. 2007; Bhattacharya and Mazumder 2011), polynomial regressions (Bratsberg et al. 2007), and quantile regression approaches (Eide and Showalter 1999; Grawe 2004).
Transition matrices represent a useful tool to document the movement of individuals across different classes. The use of transition matrices offers a detailed depiction of intergenerational mobility and facilitates the interpretation of the phenomenon: mobility measured through transition matrices may be viewed as a reranking or positional phenomenon, in which individuals switch classes across generations. A key issue with this methodology consists in defining the thresholds between classes. Formby et al. (2004) distinguish between quantile and size transition matrices. In quantile transition matrices, class boundaries are set to have the total number of sample units equally divided across classes, whereas thresholds are exogenously set in size transition matrices. Transition matrices are also adopted in quantitative sociology, where they are known as class mobility tables and they are based on occupational classifications. An important contribution in this literature is provided by Erikson and Goldthorpe (1992), who developed a class schema and described the movement among classes through mobility tables. Similarly, Long and Ferrie (2013) use transition matrices to assess intergenerational occupational mobility in Britain and the US throughout the twentieth century.
We propose a model for the measurement of intergenerational class mobility in a multivariate framework, where individual social status is modeled as a discrete latent variable. Other recent works study intergenerational mobility adopting a latent variable approach in which multiple observable outcomes are considered noisy measures of an underlying latent variable, e.g. social status for Vosters and Nybom (2017), Vosters (2018) and Adermon et al. (2021), human capital for Fletcher and Han (2019), health status for Halliday et al. (2020). The use of multivariate statistical methods is becoming increasingly popular not only in mobility but more generally in studies of welfare economics. 1 Yet, to the best of our knowledge, the approaches developed in the literature of intergenerational mobility focus on the case of latent social status conceived as a continuous variable. While this choice is appropriate if the main interest lies in estimating intergenerational correlation coefficients, a discrete formulation of social status is more suited if the researcher's aim is to identify a finite number of classes in the population, and to study transition patterns among classes.
We fill this gap in the literature by proposing a model that adopts a latent class formulation, where individuals from the same family (e.g., father and adult son) are allowed to belong to different social classes in the two generations in which they are observed. The main feature of the model is represented by the matrix collecting transition probabilities, which explains the movement among classes across the two generations. This model deals effectively with two crucial issues in the methodological literature on intergenerational mobility: (i) accounting for multiple dimensions of individuals' social status, and (ii) establishing class boundaries when measuring mobility through transition matrices.
The main advantage of the model is that it allows data-driven clustering of subjects in different classes, thus avoiding any discretionary criterion in setting class boundaries. The problem of class boundaries when using transition matrices is solved endogenously with the estimation algorithm, which classifies individuals based on the posterior probabilities of belonging to each class rather than setting arbitrary thresholds. By estimating the model, it is possible to assign each individual to a class, discussing the main features of the class itself and computing mobility measures based on the estimated transition matrix. Treating the individual status as a latent variable, we incorporate the popular concept that social status is not observable but may be only proxied. As proxies for individuals' social status, we use income and occupation, in line with the previous literature on intergenerational mobility in economics and sociology.
The empirical application is based on US data from the National Longitudinal Survey of Youth (NLSY79). We further divide the sample into two subgroups based on the race and ethnicity of adult sons, providing a comparative analysis of mobility between the two groups using absolute mobility measures and overall mobility indices.
The paper is organized as follows. In Section 2, we formulate the proposed specification. In Section 3, we present the empirical application. This section includes the data description, the discussion of the sample selection rules as well as the results. Concluding remarks are provided in Section 4.

A mixed-type data model for intergenerational mobility
To develop a model for intergenerational mobility, we rely on a latent class formulation (Lazarsfeld and Henry 1969; Goodman 1974) where individuals from the same family are allowed to belong to different social classes. Formally, our model belongs to the class of latent Markov models (LMMs; Bartolucci et al. 2013), also known as latent transition analysis models (Collins and Lanza 2010). In the following, we provide an overview of LMMs and we highlight the connection to related models adopted in the mobility literature. Then, we discuss the model assumptions and the estimation procedure.

Overview on LMMs and relation to existing literature
LMMs represent a generalization of latent class models to longitudinal data, in which the units of observation are allowed to move between latent classes. They are designed to model class composition, as well as the incidence of transitions over time in latent class membership. In other words, a typical analysis of LMMs based on multivariate longitudinal data consists of finding clusters of units and studying the transition between these clusters. These models assume the existence of a latent process affecting the distribution of the observable (manifest) variables. The latent process follows a first-order Markov chain with a finite number of states, typically referred to as latent states. The manifest variables are assumed to be independent, conditional on the latent process: if we knew the latent state of an individual, then the realization of a manifest variable would not help in predicting the other manifest variables, as the latent process is the only explanatory factor of the observables. A thorough description of the assumptions is provided in the next section, in connection with the proposed formulation.
The use of LMMs to study intergenerational mobility presents distinct features with respect to standard applications of this type of model. The first peculiarity is that the number of time occasions is limited. In studies of intergenerational mobility, the number of time occasions is usually equal to two, coinciding with parental and offspring generations (studies that analyze more than two generations of individuals are rare; see for instance Adermon et al. 2021 and Modalsli 2021). In typical applications of LMMs, the number of time periods is usually greater than or equal to three. 2 The second peculiarity is that the individuals are different in the two time occasions (e.g., fathers and adult sons). We propose a formulation with a grouping variable for each generation to make this feature explicit. Additionally, the proposed model allows the number of classes to change across the two generations, as in Anderson et al. (2019).
Similar frameworks combining several status proxies have been adopted in the mobility literature. Vosters and Nybom (2017) and Vosters (2018) adopt the methodology developed by Lubotsky and Wittenberg (2006) to provide estimates of intergenerational status persistence in Sweden and the US. This method provides an estimate of intergenerational persistence by linearly combining the observable proxies, in a regression framework in which all the proxies are included simultaneously. In this framework, the latent factor is included as an explanatory variable in a linear model where the dependent variable is the son's log earnings. Differently from our approach, multidimensionality only enters into the right-hand side of the model. In general, the choice of a suitable latent variable model depends on the given context and application. Most of the time, latent status is modeled as a continuous latent variable. 3 While this specification is often conveniently adopted (e.g., for income and self-reported health status, which are typically continuous and ordinal, respectively), it would still require setting exogenous thresholds on the latent scores to build a transition matrix. This choice is not particularly suited to the present context, as the main interest lies in partitioning the population into a finite number of groups (social classes) and studying transition patterns. In such a case, a discrete variable with a finite number of states seems the most appropriate solution. Moreover, as long as the analysis is conducted treating the number of latent classes as unknown, a discrete approximation is known to work well even if the underlying latent variable is actually continuous. This is widely known in the literature on finite mixture models and related semi-parametric maximum likelihood estimation; see, for instance, Lindsay et al. (1991), Mroz (1999), Cameron and Heckman (2001) and Arcidiacono and Jones (2003).

Main formulation
Let $T = 2$ be the number of time periods, and suppose that our units of observation are $n$ parent-adult child pairs $c = 1, \ldots, n$. For each generation $t$, we observe a realization $y_{ct}$ of the bivariate vector of manifest variables $Y_{ct} = (Y_{1ct}, Y_{2ct})$. Assume that $Y_{1ct}$ is a continuous variable (e.g., income), and $Y_{2ct}$ is categorical (e.g., the type of occupation). Define $\tilde{Y}_c$ as the vector of the manifest variables stacked along the time dimension and denote its realization as $\tilde{y}_c$. In other words, the manifest variables of pair $c$ in period $t = 1$ refer to some measure of income and occupation of the parents' generation, while in $t = 2$ the same variables refer to adult children. Suppose the existence of two discrete latent grouping variables capturing social status, $F_c$ and $S_c$ for the first and second generation respectively, which are collected in the random vector $U_c = (F_c, S_c)$. We allow the number of states (also named latent classes) to be different in the two generations, that is, to be time-specific: $k_t$, for $t = 1, 2$. As mentioned, the main interest of the model lies in the distribution of the latent process denoted by $U_c$. We refer to the initial and transition probabilities as:
\[ \pi_v = P(F_c = v), \qquad \pi_{s|v} = P(S_c = s \mid F_c = v), \qquad P(U_c = u) = \pi_v \times \pi_{s|v}, \tag{1} \]
where $u = (v, s)$ denotes a realization of the random vector $U_c$. With respect to the measurement component, we make use of the local independence assumption, also known as the contemporaneous independence assumption. This assumption implies:
\[ f^{(1)}(y_{c1} \mid F_c = v) = f^{(1)}(y_{1c1} \mid F_c = v)\, f^{(1)}(y_{2c1} \mid F_c = v), \qquad f^{(2)}(y_{c2} \mid S_c = s) = f^{(2)}(y_{1c2} \mid S_c = s)\, f^{(2)}(y_{2c2} \mid S_c = s), \tag{2} \]
where we use $f^{(t)}$ as a generic symbol for a density or probability function. Note that the conditional distribution of each variable $Y_{rct}$ given the latent state is allowed to vary over time; that is, it is characterized by time-specific parameters, as indicated by the superscript $(t)$. A path diagram of the proposed model is illustrated in Fig. 1, where unobservable random variables are indicated by circles and observables are indicated by squares.
Fig. 1 Path diagram of the model
We assume that the components of the continuous variable are Gaussian densities with state-specific means and variances: 4
\[ Y_{1c1} \mid F_c = v \sim N(\mu_v, \sigma_v^2), \qquad Y_{1c2} \mid S_c = s \sim N(\xi_s, \tau_s^2), \]
for $v = 1, \ldots, k_1$, $s = 1, \ldots, k_2$. Let $j$ be the number of categories of $Y_{2ct}$, $t = 1, 2$. The number of categories, thus, is fixed across generations. Denote the conditional probabilities of $Y_{2ct}$ as:
\[ \eta^{(1)}_{y|v} = P(Y_{2c1} = y \mid F_c = v), \qquad \eta^{(2)}_{y|s} = P(Y_{2c2} = y \mid S_c = s), \qquad y = 1, \ldots, j. \]
Again, the measurement model's parameters are generation-specific. For instance, $\mu_4$ and $\sigma_4$ represent the mean and standard deviation of the income density associated with the fourth class of the first generation, while $\eta^{(2)}_{3|1}$ is the probability of being in the third occupational category for a second-generation individual belonging to the first class.
Making use of Eqs. 1 and 2, we obtain the marginal distribution of the manifest variables (the manifest distribution) by marginalizing over the distribution of the latent process:
\[ f_{\tilde{Y}_c}(\tilde{y}_c) = \sum_{v=1}^{k_1} \sum_{s=1}^{k_2} \pi_v\, \pi_{s|v}\, f^{(1)}(y_{c1} \mid F_c = v)\, f^{(2)}(y_{c2} \mid S_c = s). \tag{3} \]
Thus, the computation of $f_{\tilde{Y}_c}(\tilde{y}_c)$ requires summing over all the possible $k_1 \times k_2$ configurations of the vector $u$. This is done by resorting to the forward-backward recursions within the expectation-maximization (EM) algorithm (Baum et al. 1970; Welch 2003) developed in the hidden Markov models literature and implemented through suitable matrix notation (Bartolucci 2006; Zucchini and MacDonald 2009).
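With only two generations, the sum over the k 1 × k 2 latent configurations can be evaluated by direct enumeration. The following Python sketch illustrates the computation for a single father-son pair under the Gaussian and categorical measurement assumptions of the model. The function names and the nested-list layout of the parameters are illustrative assumptions, not the authors' implementation.

```python
import math


def normal_pdf(x, mean, sd):
    """Gaussian density with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def manifest_density(incomes, occupations, init, trans, gauss, occ):
    """Marginal density of one father-son pair (hypothetical layout).

    incomes     = (father_income, son_income)
    occupations = (father_category, son_category), 0-based indices
    init[v]     = initial probability of father's class v
    trans[v][s] = probability of son's class s given father's class v
    gauss[t][state] = (mean, sd) of the income density, t = 0 fathers, 1 sons
    occ[t][state][category] = conditional occupation probability
    """
    total = 0.0
    for v, p_v in enumerate(init):
        # Local independence: income and occupation factorize given the class.
        f_father = normal_pdf(incomes[0], *gauss[0][v]) * occ[0][v][occupations[0]]
        for s, p_sv in enumerate(trans[v]):
            f_son = normal_pdf(incomes[1], *gauss[1][s]) * occ[1][s][occupations[1]]
            total += p_v * p_sv * f_father * f_son
    return total
```

For longer panels the same quantity would be obtained with forward recursions; enumeration is used here only because T = 2 keeps it cheap.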
A straightforward interpretation of such a model arises. In particular, the following (and possibly rectangular) matrix:
\[ \Pi = \begin{pmatrix} \pi_{1|1} & \cdots & \pi_{k_2|1} \\ \vdots & \ddots & \vdots \\ \pi_{1|k_1} & \cdots & \pi_{k_2|k_1} \end{pmatrix}, \]
that is, the matrix collecting the transition probabilities, may be interpreted as a matrix of intergenerational mobility, where social statuses in both time periods are measured based on the observed level of the manifest variables. The proposed model formulation makes it possible to analyze separately the two main mobility concepts, namely, structural and exchange mobility. These two notions, though inherited from sociology, are frequently adopted in the economic literature (Markandya 1982; Schluter and Van de Gaer 2011). Structural mobility is captured by changes in the classes' expectations of the manifest variables, and by the change in the number of classes over generations. Therefore, the transition matrix depicts exchange mobility entailing the reranking mechanism at work from one generation to another. Furthermore, the initial probabilities $\pi_v$, $v = 1, \ldots, k_1$, represent the sizes of the $k_1$ social latent classes of the first generation, while the sizes of the second generation's $k_2$ classes may be retrieved as
\[ \delta_s = \sum_{v=1}^{k_1} \pi_v\, \pi_{s|v}, \]
for $s = 1, \ldots, k_2$. Interestingly, comparisons of exchange mobility between two distinct societies may be obtained from the differences in their transition matrices. If the number of classes were to differ between countries, testing the difference in each single transition probability would be unfeasible. However, mobility indices summarizing overall transition patterns into a scalar are available in the literature (e.g., Anderson 2018). Some caution would be required if state-dependent distributions were to differ significantly. Nonetheless, the study of class definitions and compositions is relevant in itself, in a comparative perspective, to highlight differences in social classes among countries. The same considerations hold for a given country observed over time.
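The sizes of the second generation's classes follow from the initial probabilities and the transition matrix by the law of total probability. A minimal Python sketch is given below; the function name and the numerical values are purely illustrative and are not estimates from the paper.

```python
def second_generation_sizes(init, trans):
    """Class sizes of the second generation.

    init[v]     = size of first-generation class v
    trans[v][s] = probability of son's class s given father's class v
    The matrix may be rectangular (k1 rows, k2 columns).
    """
    k2 = len(trans[0])
    return [sum(p_v * row[s] for p_v, row in zip(init, trans)) for s in range(k2)]


# Hypothetical 3 x 4 example (values chosen only for illustration).
init = [0.5, 0.3, 0.2]
trans = [
    [0.6, 0.3, 0.1, 0.0],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.2, 0.4, 0.3],
]
sizes = second_generation_sizes(init, trans)
```

Because each row of the transition matrix sums to one, the resulting sizes also sum to one.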

Including categorical covariates in the latent model
The model can easily accommodate the role of explanatory variables in the formation of social classes and the dynamics among them. Suppose that for each pair $c$ we also observe a categorical variable, $G_c$, which identifies $L$ subgroups in the sample of parent-child pairs, that is, $G_c = 1, \ldots, L$. In the United States, for instance, one of the main covariates that must be taken into account is given by the individuals' ethnicity and race. We can model the initial and transition probabilities conditional on belonging to group $g$, namely,
\[ \pi^g_v = P(F_c = v \mid G_c = g), \qquad \pi^g_{s|v} = P(S_c = s \mid F_c = v, G_c = g). \]
The $\delta^g_s$ parameters may be retrieved accordingly. The same setup applies to the inclusion of two or more categorical covariates in the model, where the $L$ groups are identified by the interaction of all the covariates.

Maximum likelihood estimation
Under the main formulation presented in Section 2.2.1, assuming independence of the $n$ sample units, the log-likelihood of the model may be expressed in the following way:
\[ \ell(\theta) = \sum_{c=1}^{n} \log f_{\tilde{Y}_c}(\tilde{y}_c), \tag{4} \]
where $\theta$ is the set of all the model parameters. The number of free parameters is equal to $k_1 - 1$ in the initial distribution, $k_1(k_2 - 1)$ in the transition distribution (each of the $k_1$ rows of the transition matrix sums to one), 5 $2(k_1 + k_2)$ for the Gaussian densities, and $(k_1 + k_2)(j - 1)$ for the categorical responses. The function in Eq. 4 can be maximized by means of the EM algorithm (Dempster et al. 1977). The EM algorithm treats the individual latent states as missing data and finds the maximum likelihood estimates of the parameters in Eq. 4 by maximizing the complete data log-likelihood (CDLL).
5 These numbers increase to $L(k_1 - 1)$ and $L k_1 (k_2 - 1)$ if we include $G_c$ in the model, and parameters $(\pi^g_v, \pi^g_{s|v})$ replace $(\pi_v, \pi_{s|v})$ in the log-likelihood function.
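To make the role of the missing latent states concrete, the sketch below shows one EM iteration restricted to the latent part of the model, written in Python. It assumes that the state-dependent densities of each pair have already been evaluated (the arrays f1 and f2 are hypothetical inputs), and it omits the updates of the Gaussian and categorical parameters. It is a simplified illustration and not the algorithm described in the paper's appendix.

```python
import math


def e_step(f1, f2, init, trans):
    """Posterior probabilities of each (father class, son class) configuration.

    f1[c][v] = density of the father's data in pair c given class v
    f2[c][s] = density of the son's data in pair c given class s
    Returns the list of posterior matrices and the log-likelihood.
    """
    post, loglik = [], 0.0
    for f1_c, f2_c in zip(f1, f2):
        joint = [[init[v] * trans[v][s] * f1_c[v] * f2_c[s]
                  for s in range(len(f2_c))]
                 for v in range(len(f1_c))]
        dens = sum(map(sum, joint))          # manifest density of pair c
        loglik += math.log(dens)
        post.append([[x / dens for x in row] for row in joint])
    return post, loglik


def m_step_latent(post):
    """Closed-form update of initial and transition probabilities."""
    n = len(post)
    k1, k2 = len(post[0]), len(post[0][0])
    # Expected number of fathers in each class.
    row_mass = [sum(sum(p[v]) for p in post) for v in range(k1)]
    init = [m / n for m in row_mass]
    trans = [[sum(p[v][s] for p in post) / row_mass[v] for s in range(k2)]
             for v in range(k1)]
    return init, trans
```

Alternating the two functions until the log-likelihood stabilizes gives the latent-part updates; the posterior matrices returned by the E-step are also what would be used to assign individuals to classes.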
A complete description of the estimation algorithm is provided in Appendix A in the supplementary material. In Appendix B in the supplementary material we provide the modifications to the EM algorithm required for the extended setting outlined in Section 2.2.2. A crucial issue is the choice of the number of latent classes. A common approach to determine the number of states is based on the Bayesian Information Criterion (BIC; Schwarz 1978), which is expressed as $\mathrm{BIC} = -2\hat{\ell} + p \log(n)$, where $\hat{\ell}$ is the maximized log-likelihood and $p$ is the number of nonredundant parameters. In words, the BIC is a penalized version of the maximized log-likelihood where the penalty term depends on the model complexity (i.e., the number of parameters), and it is known to perform well in the context of LMMs (e.g., Bartolucci and Farcomeni 2009). The model to be selected is the one with the smallest BIC.
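The selection rule can be sketched in a few lines of Python. The log-likelihood values and parameter counts in the example are hypothetical placeholders, not results from the paper; only the sample size of 1722 pairs is taken from the text.

```python
import math


def bic(max_loglik, n_params, n_obs):
    """Bayesian Information Criterion: -2 * loglik + p * log(n)."""
    return -2.0 * max_loglik + n_params * math.log(n_obs)


def select_by_bic(candidates, n_obs):
    """Return the (k1, k2) pair with the smallest BIC.

    candidates maps (k1, k2) to (maximized log-likelihood, number of parameters).
    """
    return min(candidates, key=lambda k: bic(*candidates[k], n_obs))


# Hypothetical fitted models (illustrative numbers only).
candidates = {
    (2, 2): (-9100.0, 20),
    (3, 4): (-8950.0, 39),
    (4, 5): (-8940.0, 52),
}
best = select_by_bic(candidates, n_obs=1722)
```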

Data
We use data from the National Longitudinal Survey of Youth 1979 cohort (NLSY79). The NLSY79 is a longitudinal survey that follows the lives of a sample of 12,686 young American men and women who were 14 to 22 years old when first interviewed in 1979. Respondents were interviewed annually until 1994, and every other year since 1996. The project provides data available up to 2014 (Round 26), and the questions refer to the years before the interviews. During the first waves, for respondents living with their parents, a section of the survey (Household Interview) was addressed directly to the respondents' parents and included information on family income at the parental level. In 1979, respondents were further asked to report parents' occupations. Given the long time period spanned by the survey, the NLSY dataset is adopted in several studies on intergenerational mobility (Jäntti et al. 2006; Bhattacharya and Mazumder 2011; Mazumder 2014). As is standard in intergenerational mobility studies, the units of observation are pairs of individuals linked across generations. Following Bhattacharya and Mazumder (2011), in our analysis we exclude daughters, to avoid labor force participation issues, and focus on father-son pairs. 6 The role of mothers as contributors to the social background is captured by the use of family income rather than fathers' earnings. Our final sample, thus, consists of 1722 men (sons) who were living with their parents in 1979.
As a measure of income, we use total net family income, that is, the sum of a number of income values for household members related to the survey respondent by blood or marriage. A limitation of the income variable is the top coding of the upper tail of each year's income distribution. 7 Given that the true income levels are not always observed, the standard EM algorithm may deliver undesirable parameter estimates, and it may sometimes fail to converge (Atkinson 1992). Accounting for censored observations would require some modifications to the model log-likelihood, and thus, to the estimation algorithm (McLachlan and Jones 1988;Atkinson 1992;Lee and Scott 2012). However, we resort to an alternative strategy. In particular, taking time averages as a measure of individual income reduces the problem, as we use information from several years for each individual. The advantages of using time averages as proxies of permanent incomes have been extensively discussed by Solon (1992) and Zimmerman (1992), and it is a common practice in empirical studies on intergenerational mobility measurement.
Thus, as a measure of parental income, we take the first three waves of the survey, 1979-1981, and we average the total net family income. To measure sons' incomes, we average the same variable over five consecutive waves (1998-2006), when respondents were 37 to 45 years old. For both generations, we use any available year of data, and we include in the sample only pairs in which sons have at least two valid income records. We convert all the income variables into constant 2000 US dollars using the Organisation for Economic Co-operation and Development (OECD) consumer price index.
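The averaging rule can be illustrated with a short Python sketch. Missing waves are represented by None, and the function name and the min_valid argument are hypothetical; the conversion to constant dollars is assumed to have been applied beforehand.

```python
def average_income(records, min_valid=1):
    """Average the valid income records of one individual.

    records   = incomes across waves, with None for missing waves
    min_valid = minimum number of valid records required
    Returns None when too few valid records are available.
    """
    valid = [x for x in records if x is not None]
    if len(valid) < min_valid:
        return None
    return sum(valid) / len(valid)


# Son observed in two of the five waves: kept in the sample.
son_income = average_income([30000.0, None, 34000.0, None, None], min_valid=2)
# Son observed in only one wave: dropped.
dropped = average_income([30000.0, None, None, None, None], min_valid=2)
```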
To have comparable occupational statuses across generations, we map the original variable labels (Census 1970 occupational codes) into a new variable that splits the occupations into three broad skill-based categories, and we refer to these categories as high-, medium- and low-skilled occupations. 8 We exclude unemployed individuals from the sample. As stated above, data on parental employment are available only for 1979. We use father's occupation whenever reported; 9 otherwise, we drop the father-son pair from the sample. We follow the same procedure for the respondents' generation in each of the five consecutive waves from 1998 to 2006. The occupational codes (Census 1970 to 2000 and Census 2000 from 2002 on) refer to the respondents' main job in the year before the interview. Regarding the long-term occupational status, Mazumder and Acosta (2015) point out that it is a more salient issue today than in the past, in particular due to a higher degree of occupational switching during the life course. We selected the most recurrent category among the nine ISCO-88 categories. If an individual had the same number of observations for two different categories, we selected the most recent one. Then, the ISCO categories were reduced to the aforementioned skill-based three-group categorization. Table 1 shows the descriptive statistics for the variables of the analysis. The gap in the average family income in the two generations is due to the structural growth of the US economy in the period of analysis as well as the role of the two different top-coding algorithms adopted in the two time periods. The lower average age of sons, usually identified as the cause of life-cycle bias in the estimates of intergenerational elasticities, is in line with the previous studies of mobility using NLSY79 data.
Moreover, with an average age of almost 40 years, the sample of sons fulfills the age requirements to keep this bias to a minimum, given that the seminal work by Haider and Solon (2006) sets the optimal age between 35 and 45 years old for the United States. The evolution of the occupational distributions in the two generations reflects changes that occurred in the labor market structure between the 1980s and the 2000s, with the increase in high-skilled jobs to the detriment of low-skilled jobs.
8 This simplified categorization is obtained from the International Standard Classification of Occupations (ISCO-88). Low-skilled occupations comprise i) plant and machine operators and assemblers and ii) elementary occupations; medium-skilled occupations comprise iii) clerks, iv) service workers and shop and market sales workers, v) skilled agricultural and fishery workers and vi) craft and related trades workers; and high-skilled occupations comprise vii) legislators, senior officials and managers, viii) professionals and ix) technicians and associate professionals. As a robustness check, we estimate the model using the nine major ISCO categories. Given the sparseness of these categories, the information matrix is not invertible due to some estimated probabilities lying at the boundary of the parameter space. Still, results in terms of mobility remain substantially unchanged.
9 We use family income and fathers' occupation to have comparable variables with the sons' generation. In a previous version of this paper, for those respondents whose father's occupation is not available, we used mother's occupation. Results remained essentially unchanged.
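The rule used for the sons' long-term occupational status (most recurrent category across waves, with ties resolved in favor of the most recently observed category) can be sketched in Python as follows. The function name and the encoding of missing waves as None are illustrative assumptions.

```python
from collections import Counter


def long_term_occupation(categories):
    """Most recurrent occupational category across waves.

    categories = chronologically ordered category codes, None when missing.
    Ties are resolved by taking the tied category observed most recently.
    """
    observed = [c for c in categories if c is not None]
    if not observed:
        return None
    counts = Counter(observed)
    top = max(counts.values())
    tied = {c for c, n in counts.items() if n == top}
    # Scan from the latest wave backwards and stop at the first tied category.
    for c in reversed(observed):
        if c in tied:
            return c


status_tie = long_term_occupation([3, 3, 5, 5, None])   # tie between 3 and 5
status_mode = long_term_occupation([2, 2, 4])           # clear mode
```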

Model fitting
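In this section we present the results of the empirical application. We recall that the assumptions of our model can be summarized as follows:
\[ Y_{1c1} \mid F_c = v \sim N(\mu_v, \sigma_v^2), \qquad Y_{1c2} \mid S_c = s \sim N(\xi_s, \tau_s^2), \]
\[ P(Y_{2c1} = y \mid F_c = v) = \eta^{(1)}_{y|v}, \qquad P(Y_{2c2} = y \mid S_c = s) = \eta^{(2)}_{y|s}, \]
\[ P(U_c = u) = \pi_v \times \pi_{s|v}, \qquad u = (v, s). \]
We discuss the selection of the number of latent classes and we comment on class composition, state-dependent distributions and transition patterns. Finally, we provide an analysis of mobility comparison among racial and ethnic groups, where the latent process is assumed to be group-specific with probability distribution $P(U_c = u \mid G_c = g) = \pi^g_v \times \pi^g_{s|v}$.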

Main formulation
To select the number of latent states, we run the model for each possible combination of $(k_1, k_2)$, with $k_1 \in \{2, 3, 4\}$ and $k_2 \in \{2, 3, 4, 5\}$. According to the BIC, a suitable number of latent states is $k_1 = 3$ and $k_2 = 4$, as shown in Table 2. Because the likelihood function may be multimodal, we estimate the model 15 times from randomly chosen starting values. Table 3 reports the estimated class sizes ($\pi_v$ and $\delta_s$, $v = 1, \ldots, 3$, $s = 1, \ldots, 4$) along with the means and standard deviations of the income distributions ($\mu_v$, $\sigma_v$, $\xi_s$ and $\tau_s$), whereas Fig. 2 plots the kernel density estimates of the income distributions with the scaled components' densities. We recall that LMMs are, in general, globally identified up to a switching of the latent states. Therefore, we identify the classes based on the ordering of the Gaussian means at each time period.
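Ordering the classes by their Gaussian means amounts to permuting the rows and columns of the estimated quantities after estimation. A minimal Python sketch, with hypothetical function names and illustrative values, is the following.

```python
def order_states(means):
    """Indices of the latent states sorted by increasing Gaussian mean."""
    return sorted(range(len(means)), key=lambda i: means[i])


def relabel(init, trans, means_first, means_second):
    """Reorder initial and transition probabilities by income means.

    Rows follow the ordering of first-generation means, columns the ordering
    of second-generation means, so class 1 is always the lowest-income class.
    """
    rows = order_states(means_first)
    cols = order_states(means_second)
    new_init = [init[v] for v in rows]
    new_trans = [[trans[v][s] for s in cols] for v in rows]
    return new_init, new_trans


# Illustrative unordered solution (not estimates from the paper).
init, trans = relabel(
    init=[0.2, 0.5, 0.3],
    trans=[[0.9, 0.1], [0.3, 0.7], [0.6, 0.4]],
    means_first=[50.0, 20.0, 35.0],
    means_second=[60.0, 25.0],
)
```

The state-dependent parameters (means, standard deviations and occupation probabilities) would be permuted with the same index lists.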
We emphasize that the latent states have different interpretations in the two time periods. This is particularly true when the number of classes varies over generations as in our application. 10 In this way, we are able to account for structural changes in the economy, that is, changes in the income distribution and in the labor market. In the second generation, the top class is composed of "super rich" individuals as the average of their component is well above the overall average, and the high variance of the same component captures the positive skewness of the observed distribution.
In Fig. 3, we plot for each generation the estimated probabilities of belonging to each occupational category given each latent state ($\eta^{(1)}_{y|v}$ and $\eta^{(2)}_{y|s}$). Bottom classes are associated with a high probability of being employed in a low-skilled occupation. In contrast, top classes are associated with a high probability of being employed in a high-skilled occupation. In the second generation, the third and fourth classes' occupational composition is similar, as both classes are mainly composed of high-skilled workers. As noted, the fourth class in the second generation, which is characterized by higher family income, emerges as an "extension" of the third class. The estimated transition matrix is presented in Table 4. There are several ways to analyze transition patterns in the transition matrix $\Pi$, starting from absolute mobility measures. 11 We look at upward mobility as the probability of reaching the top class for sons born to fathers in the lowest class, which is a measure of great normative interest for equality of opportunity reasons. This value corresponds to the value at the top-right corner of the transition matrix, and in our case it is equal to 0.013 (i.e., 1.3 percent). On the other hand, downward mobility is represented by the probability of moving to the bottom class for sons born to fathers in the top class. This value corresponds to the value at the lower-left corner of the transition matrix, and it is equal to 0.036 (i.e., 3.6 percent). These results provide evidence of a low level of mobility.
Notes: Panel A shows the estimated size ($\pi_v$), the mean ($\mu_v$) and the standard deviation ($\sigma_v$) of the Gaussian density of each class $v$ in thousand dollars, for fathers. Panel B shows the estimated size ($\delta_s$), the mean ($\xi_s$) and the standard deviation ($\tau_s$) of the Gaussian density of each class $s$ in thousand dollars, for adult sons. Standard errors (in parentheses) are computed with a non-parametric bootstrap with 100 replications.
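The absolute mobility measures discussed above are read directly from the corners of the transition matrix. In the Python sketch below, only the two corner values 0.013 and 0.036 are taken from the text; all other entries of the 3 × 4 matrix are hypothetical placeholders chosen so that rows sum to one, and the function name is an assumption.

```python
def corner_mobility(trans):
    """Corner entries of a (possibly rectangular) transition matrix.

    Rows index fathers' classes and columns index sons' classes,
    both ordered from the bottom class to the top class.
    """
    return {
        "upward": trans[0][-1],              # bottom-class father, top-class son
        "downward": trans[-1][0],            # top-class father, bottom-class son
        "persistence_bottom": trans[0][0],
        "persistence_top": trans[-1][-1],
    }


# Placeholder matrix: only the two off-diagonal corners come from the text.
example = [
    [0.600, 0.300, 0.087, 0.013],
    [0.250, 0.450, 0.250, 0.050],
    [0.036, 0.200, 0.500, 0.264],
]
measures = corner_mobility(example)
```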
According to the estimates, those who have fathers in the first class are unlikely to reach the higher classes. The opposite happens for those whose fathers are in the top class. The degree of persistence at the top and at the bottom (i.e. the probability, for those who have fathers in the first and bottom class, to remain in the same class) provides additional significant information: The persistence at the

Race and ethnicity: mobility comparison
The ethnic and racial differences in the income distribution in the US have well-known foundations originating in the last centuries. These inequalities may seriously hamper the process of development if they are persistent over generations. Estimates of intergenerational mobility by ethnic group (or race) provide insights into whether racial differences in the United States are likely to be eliminated, and, if so, how long it might take. Other studies have analyzed the role that racial or ethnic origins play in terms of intergenerational mobility; see for example Mazumder (2014) and Bhattacharya and Mazumder (2011) who use NLSY data adopting different methodologies. 13 The authors conclude that, in recent decades, black individuals have experienced substantially less (more) upward (downward) intergenerational mobility than white individuals.
12 This result may be induced by the different class structures in the two generations. However, as shown in the supplementary material (Section 1), the persistence at the bottom is higher than the persistence at the top also when looking at the (4 × 4) transition matrix.
13 They adopt transition probabilities of relative income status and measures of directional rank mobility.
Significance levels: *** p < .01, ** p < .05, * p < .10. Notes: The table shows the average posterior probability of belonging to each latent class, conditional on sons' race/ethnicity. Row 3 of panels a and b shows the differences in the probability of belonging to each class between white and non-white individuals.
African-American and Hispanic individuals represent about 46% of our sample (Table 1). In the following, we partition the sample into $L = 2$ groups, based on sons' racial or ethnic origins: white (w) and non-white (nw) individuals, where non-whites are African-American and Hispanic individuals. The choice to consider the two minority groups together is driven by estimation issues related to the restricted sample size. The empirical model allows us to study class composition and transition paths for the two groups. Now, $\pi^g_v$ may be interpreted as the probability of belonging to class $v$ for a father whose son belongs to group $g$, with $g = w, nw$. Similarly, $\pi^g_{s|v}$ is now the probability that a son from group $g$ belongs to class $s$ conditional on his father's class being $v$.
A substantial difference exists in terms of class composition. In particular, most non-white sons belong, on average, to the bottom classes (Table 5, panel b). For instance, the probability of belonging to the bottom class is equal to 0.221 for white individuals, which is approximately 26 percentage points lower than the same probability for non-white individuals. This difference is statistically significant at the 1% level. On the other hand, white individuals have higher probabilities of belonging to classes two, three and four (12.7, 9.8 and 3.6 percentage points, respectively). The same results hold for their fathers (panel a, third row).
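The class-composition comparison can be illustrated with a minimal sketch. It assumes the fitted model returns, for each son, a vector of posterior class-membership probabilities. The function name `class_composition_by_group`, the group labels and the toy numbers are hypothetical and are not taken from the NLSY estimates.

```python
import numpy as np

def class_composition_by_group(posteriors, groups):
    """Average posterior class-membership probabilities within each group.

    posteriors: (n, S) array; row i holds the posterior probabilities that
                individual i belongs to each of the S latent classes.
    groups:     length-n array of group labels (here "w" and "nw").
    """
    posteriors = np.asarray(posteriors, dtype=float)
    groups = np.asarray(groups)
    return {g: posteriors[groups == g].mean(axis=0) for g in np.unique(groups)}

# Toy, made-up posteriors for six sons and four latent classes.
post = np.array([
    [0.7, 0.2, 0.1, 0.0],
    [0.1, 0.5, 0.3, 0.1],
    [0.2, 0.3, 0.3, 0.2],
    [0.8, 0.1, 0.1, 0.0],
    [0.6, 0.3, 0.1, 0.0],
    [0.4, 0.4, 0.1, 0.1],
])
race = np.array(["w", "w", "w", "nw", "nw", "nw"])

comp = class_composition_by_group(post, race)
gap = comp["w"] - comp["nw"]  # analogue of the difference row in a composition table
print(comp["w"], comp["nw"], gap)
```

Because each group's average probabilities sum to one, the entries of `gap` sum to zero: a lower probability of the bottom class for one group must be offset by higher probabilities elsewhere, as in the figures reported above.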
Differences in transition patterns are less clear cut. Table 6 shows the transition probabilities for white individuals, while Table 7 displays the differences in these parameters between white and non-white individuals. For instance, there is a significant difference in the persistence in the lowest class. This probability is equal to 0.595 for white individuals, and it is almost 10 percentage points lower than the corresponding probability for non-white individuals (Table 7, top-left corner). White individuals also show a lower propensity to transition from the second to the lowest class (-0.062), whereas their probability of moving from the lowest to the second class is higher (0.068). On the other hand, persistence at the top is similar. It corresponds to the propensity of moving from the third to the fourth class, and it is equal to 0.252 for white individuals and to 0.239 for non-white individuals; this difference is not statistically significant (Table 7, bottom-right corner). Upward mobility of white individuals is almost twice the upward mobility of non-white individuals, whereas the downward mobility rate of non-white individuals is more than twice that of white individuals, although these values are low and not statistically different (top-right and bottom-left corners of Tables 6 and 7, respectively).
To provide formal evidence on the overall transition patterns, we introduce a set of summary measures of mobility, and we test the differences in these measures computed for the two groups. A mobility index is a function $M(\Pi)$ mapping the transition matrix $\Pi$ into a scalar (Formby et al. 2004; see also Fields and Ok 1996; Checchi and Dardanoni 2002). We use the index developed by Anderson (2018), denoted $A$, which is defined in terms of $\max(\pi_{s|\cdot})$ and $\min(\pi_{s|\cdot})$, the operators returning the maximum and minimum value, respectively, of the $s$th column of $\Pi$. The $A$ index satisfies the normalization, immobility and perfect mobility axioms (Shorrocks 1978). 15 We further adopt two modified versions of the Bartholomew index (Bartholomew 1982), $B_1$ and $B_2$. The idea behind the original index is that each class transition is weighted by the number of crossed boundaries. The two modified versions are distinguished by the weights attached to each transition, given that, when the number of classes differs across generations, the interpretation of a class transition is not straightforward. In $B_1$, the weights are given by the number of crossed boundaries. In $B_2$, the weights $w_{sv}$ are set equal to one for transitions to subsequent classes, equal to two for 2-step transitions, and so on. Up to the rescaling that bounds them between 0 and 1, the indices are defined as follows:

$$B_1 \propto \sum_{v} \sum_{s} \pi_v \, \pi_{s|v} \, |s - v|, \qquad B_2 \propto \sum_{v} \sum_{s} \pi_v \, \pi_{s|v} \, w_{sv}.$$
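As a rough illustration of a Bartholomew-type index on a rectangular matrix, the sketch below computes the weighted sum of $\pi_v \pi_{s|v}$ over all father-son class pairs. The function name `weighted_distance_index`, the toy 3×4 matrix and, in particular, the rescaling by the largest weight are assumptions made for this example; the exact normalization and the $B_2$ weights used in the paper are not reproduced here.

```python
import numpy as np

def weighted_distance_index(pi_v, P, weights=None):
    """Bartholomew-type index: sum over (v, s) of pi_v * pi_{s|v} * w_{sv}.

    pi_v:    length-V vector of fathers' class probabilities.
    P:       (V, S) matrix whose row v holds pi_{s|v}; may be rectangular.
    weights: (V, S) matrix of weights; defaults to |s - v|.
    ASSUMPTION: dividing by the largest weight is our own rescaling,
    chosen only so that the result lies in [0, 1].
    """
    pi_v = np.asarray(pi_v, dtype=float)
    P = np.asarray(P, dtype=float)
    n_f, n_s = P.shape
    if weights is None:
        # Absolute distance between son's class index and father's class index.
        weights = np.abs(np.arange(n_s)[None, :] - np.arange(n_f)[:, None])
    weights = np.asarray(weights, dtype=float)
    raw = float(np.sum(pi_v[:, None] * P * weights))
    return raw / weights.max()

# Toy 3x4 transition matrix (made-up numbers; each row sums to one).
pi_v = [0.4, 0.4, 0.2]
P = [[0.6, 0.3, 0.1, 0.0],
     [0.2, 0.5, 0.2, 0.1],
     [0.1, 0.2, 0.4, 0.3]]

b1 = weighted_distance_index(pi_v, P)
immobile = weighted_distance_index([1 / 3] * 3, np.eye(3))  # no one changes class
reversal = weighted_distance_index([0.5, 0.0, 0.5],
                                   [[0, 0, 1], [0, 1, 0], [1, 0, 0]])
print(b1, immobile, reversal)
```

Passing a custom `weights` matrix lets the same function cover alternative weighting schemes; under the default weights an identity matrix yields zero, in line with the immobility axiom.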
Similar to the standard Bartholomew index, these modified versions satisfy the normalization and the immobility axioms, but not the perfect mobility axiom. Nonetheless, their interpretation is straightforward: they are bounded between 0 and 1, and the higher their values, the greater the mobility level of the matrix. Table 8 reports the mobility indices computed on the estimated 3×4 transition matrices of the two groups. The point estimates suggest that the overall degree of mobility is higher for white individuals, and this holds true for all the considered indices. In particular, the Anderson index equals 0.492 in the white group and 0.476 in the non-white group. However, these differences are not statistically different from zero (the last column of the table illustrates the test results). Overall, these results provide: (i) strong evidence of a racial/ethnic gap in class compositions, and (ii) mild evidence of a racial/ethnic gap in transition patterns.
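The notes to Table 8 state that standard errors come from a non-parametric bootstrap with 100 replications. The sketch below shows the general idea for a group difference in a distance-weighted index. It is a simplified stand-in: it resamples individuals with observed (hard) class labels, whereas a full implementation would presumably re-estimate the latent class model on every replicate. All names, the simulated data and the rescaling by the largest distance are assumptions for illustration.

```python
import numpy as np

def index_from_pairs(father, son, n_f, n_s):
    """Distance-weighted mobility index from 0-based hard class labels."""
    counts = np.zeros((n_f, n_s))
    np.add.at(counts, (father, son), 1.0)   # cross-tabulate father/son classes
    joint = counts / counts.sum()           # empirical analogue of pi_v * pi_{s|v}
    dist = np.abs(np.arange(n_s)[None, :] - np.arange(n_f)[:, None])
    return float((joint * dist).sum() / dist.max())  # assumed rescaling to [0, 1]

def bootstrap_group_difference(father, son, group, n_f, n_s, reps=100, seed=0):
    """Point estimate and bootstrap standard error of index(w) - index(nw)."""
    rng = np.random.default_rng(seed)
    father, son, group = map(np.asarray, (father, son, group))

    def diff(idx):
        f, s, g = father[idx], son[idx], group[idx]
        return (index_from_pairs(f[g == "w"], s[g == "w"], n_f, n_s)
                - index_from_pairs(f[g == "nw"], s[g == "nw"], n_f, n_s))

    n = len(father)
    point = diff(np.arange(n))
    # Resample individuals with replacement and recompute the difference.
    draws = np.array([diff(rng.integers(0, n, size=n)) for _ in range(reps)])
    return point, float(draws.std(ddof=1))

# Simulated toy data: 400 father-son pairs, 3 father classes, 4 son classes.
toy_rng = np.random.default_rng(1)
n_obs = 400
group = toy_rng.choice(["w", "nw"], size=n_obs)
father = toy_rng.integers(0, 3, size=n_obs)
son = toy_rng.integers(0, 4, size=n_obs)

point, se = bootstrap_group_difference(father, son, group, n_f=3, n_s=4)
print(point, se)
```

A difference that is small relative to its bootstrap standard error would be read as not statistically different from zero. The sketch assumes every resample contains members of both groups, which is safe for samples of this size but would need a guard for very small groups.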

Table 8 notes. Significance levels: *** p < .01, ** p < .05, * p < .10. The table shows the Anderson (A) and modified Bartholomew (B 1, B 2) indices computed on the model's estimated transition matrices for the whole sample and the two racial (ethnic) groups. The last column (Group difference) reports the differences of the indices computed on the group-specific matrices. Standard errors (in parentheses) are computed with non-parametric bootstrap with 100 replications.

Conclusions
This paper proposes a new approach, based on a multivariate framework, to the study of intergenerational mobility. The adopted model may be cast in the class of latent Markov models, where an individual's status is treated as a discrete latent variable with a finite number of states. Individuals are aggregated into different classes, and a latent transition matrix between the two generations' classes is estimated. The peculiarity of this model is that it allows us to analyze a multidimensional status and to avoid the issue of fixing class boundaries. Another key feature of the model is that it allows the number of classes to differ in the two time periods, potentially delivering a rectangular transition matrix. The results from the empirical application based on the National Longitudinal Survey Data show that the level of upward and downward mobility in the US is low. Moreover, non-white individuals show a higher persistence at the bottom of the distribution. We further look at more comprehensive mobility indices encompassing all the movements among latent classes. We find that overall mobility is not statistically different between the two groups. However, a significant discrepancy is found in terms of class membership.
Given its characteristics, the model may be also applied to compare mobility across societies. The resulting latent transition matrices may be adopted in comparative studies even though the number of latent classes may differ between countries. While it would be impossible to test the difference in each single transition probability, due to non-comparable class structures, comparable mobility indices are available in the literature. Future possible extensions of the model involve a more structured parametrization of initial and transition probabilities, with the inclusion of continuous covariates affecting the mobility process (e.g., years of schooling). Finally, future work might aim to adapt the model to study intragenerational mobility and the transition between social classes over the individual life-cycle.