Introduction

Sorting into higher education entails a complex set of actions of both students and colleges (Grodsky & Jackson, 2009). Even so, one step that is especially influential in shaping where students ultimately enroll is the application process (Bowen et al., 2009; Holzman et al., 2020). Mirroring enrollment gaps by social background (Alon, 2009; Chetty et al., 2020; Reardon et al., 2012), a growing number of studies have documented application disparities across socioeconomic groups (An, 2010; Holland, 2014; Mullen & Goyette, 2019). Namely, high-SES students are significantly more likely to apply to selective institutions compared to their low-SES counterparts (Hoxby & Avery, 2012; Mullen & Goyette, 2019; Radford, 2013). While it is well documented that these gaps exist, we know less about why students vary in their application behavior.

Comparing two well-developed sociological theories linking social origins to educational stratification, we undertake a quantitative investigation of how and why high- and low-SES students differ in their college application behavior. First, applying the rational action model developed within Boudon’s (1974) framework of inequality of educational opportunity (IEO), we analyze how performance differentials (or “primary effects”) and choice differentials (or “secondary effects”) vary by social class and what role they play in explaining SES-based gaps in college applications. Second, building upon the status attainment model (Sewell et al., 1969), we examine how unequal educational expectations contribute to class-based disparities in where students apply to college. Importantly, we measure educational expectations in terms of not only the level of education students plan to obtain (i.e., how far) (Mullen & Goyette, 2019), but also the type of college students plan to attend (i.e., where) (Gerber & Cheung, 2008). Finally, we include the number of applications submitted as a strategy differentially employed by students of varying socioeconomic backgrounds to improve their chances of admission to selective colleges (Radford, 2013).

We utilize data from the most recent nationally representative sample of high school students in the United States—the High School Longitudinal Study of 2009 (HSLS:09)—to examine: (1) how performance differentials, choice differentials, unequal educational expectations, and the number of applications submitted contribute to the class-based gap in college application selectivity, and (2) whether the link between expectations and applications differs by socioeconomic background. While past work in this area has tended to focus on only the highest-performing students (Hoxby & Avery, 2012; Lor, 2023; Radford, 2013), we take a broader approach and examine the application behavior of all high school graduates. Moreover, in this analysis, we do not impose restrictions in terms of which students we deem a “match” for a particular type of institution (Mullen & Goyette, 2019; Roderick et al., 2011; Roksa & Deutschlander, 2018; Smith et al., 2013). For example, because colleges and universities generally rely upon a host of factors when making admissions offers—not just academic metrics—we prefer an open approach to college applications rather than restrict our analysis to an overly narrow set of measures.

To preview our results, we can explain 85% of the gap in college application selectivity between students in the top and bottom SES quintiles. We estimate that 60% of this explained portion is due to rational action mechanisms (e.g., GPA and standardized test scores), while 35% is due to status attainment mechanisms (i.e., educational expectations and number of applications). In turn, we reveal a significant interaction between SES and type of expectations on application selectivity. For instance, we estimate that average-performing, low-SES students with the highest expectations have a 31% predicted probability of applying to selective colleges, compared to 46% among their high-SES counterparts (see Footnote 1). Overall, we believe this study provides the most comprehensive quantitative investigation to date into the drivers of the SES-based gap in college application selectivity.

Background

Over the past few decades, a growing literature has drawn attention to SES-based gaps in where students apply to college. Prior work analyzing nationally representative datasets of students in the U.S. has consistently shown differential rates of application to selective colleges by socioeconomic background. For example, using data from the National Education Longitudinal Study of 1988, Cabrera and La Nasa (2000a) highlight significant gaps in the probability of applying to four-year colleges by SES quartile. Likewise, drawing upon the Education Longitudinal Study of 2002, several studies have shown that higher-SES students are more likely to apply to selective colleges compared to their lower-SES counterparts (An, 2010; Mullen & Goyette, 2019; Roksa & Deutschlander, 2018). Finally, recent work using the High School Longitudinal Study of 2009 has documented significant gaps in the selectivity of college applications between students in the top and bottom income quartiles (Holzman et al., 2020). While these and other studies have provided strong evidence that SES shapes stratified college applications, we know relatively less about why. Comparing two sociological models of educational stratification, we aim to unpack the various mechanisms that give rise to these unequal patterns.

Rational Action Model of Stratified Applications

One potential sociological framework for understanding the SES-based gap in college applications is the rational action model (Grodsky & Jackson, 2009) of educational inequality developed in greatest depth by Boudon (1974). According to this perspective, social origins shape unequal educational outcomes via performance differentials (or primary effects) and choice differentials (or secondary effects). Whereas the former mechanism draws attention to the role of the achievement gap in linking socioeconomic background with educational stratification, the latter mechanism looks at variation net of (or conditioning on) performance disparities (Jackson, 2013). While we know that both mechanisms likely matter for where students apply, little work has estimated the relative contribution of each mechanism to stratified college applications by class background.

First, according to primary effects of the rational action model, students from varying socioeconomic backgrounds may sort into distinct pools of applicants by aligning the selectivity of their applications with their performance levels—as measured by grades and standardized test scores. This would occur if students only applied to schools that seemed to “match” their high school academic credentials (Hoxby & Avery, 2012; Mullen & Goyette, 2019). In general, past research has shown that academic achievement measures are associated with selective applications (An, 2010) and that gaps in academic qualifications help to explain stratified college destinations by family income (Bastedo & Jaquette, 2011; Holzman et al., 2020). However, it remains unclear how much performance differentials contribute to the SES-based gap in college applications.

Second, according to secondary effects of the rational action model, there are reasons to suspect that even after accounting for performance disparities, students may still apply to different types of colleges based on their class origins. This could arise for a variety of reasons. For example, students from varying social backgrounds may have differential access to information about college and ways to finance their education (Hoxby & Avery, 2012; Hoxby & Turner, 2013; McDonough, 1997; Robinson & Roksa, 2016). They may also face differing constraints in terms of the factors they consider most important when making their postsecondary decisions (Hossler & Vesper, 1999; Mullen, 2011; Perna, 2006; Roksa & Deutschlander, 2018). Indeed, qualitative research of high-performing students has uncovered various ways in which socioeconomic background continues to shape the types of colleges students apply to, even among this very select group (Lor, 2023; Radford, 2013).

Status Attainment Model of Stratified Applications

A second sociological framework for understanding the SES-based gap in college applications is the status attainment model (Grodsky & Jackson, 2009). According to this perspective, differential expectations are a critical mechanism linking social origins with stratified educational outcomes. Indeed, pioneering work among sociologists within the status attainment tradition demonstrated the role of expectations in linking socioeconomic background with unequal educational destinations (Haller & Portes, 1973; Sewell et al., 1969). In turn, recent work has revealed that differential educational expectations help to explain SES-based gaps in where students apply to college (Mullen & Goyette, 2019).

However, until now most research has been limited to measures of educational expectations in terms of how far students plan to go in school (i.e., level). We argue that in the current era of increasing access, differentiation, and competition within higher education (Alon, 2009; Bastedo & Jaquette, 2011; Mullen, 2011; Taylor & Cantwell, 2019), student expectations for where they will attend college (i.e., type) also matters. Namely, a substantial body of work provides reasons to suspect that more advantaged students may develop selective expectations for certain kinds of educational credentials that can facilitate their access to high-status positions in society (Goyette, 2008; Lucas, 2001; Mullen, 2011; Reay et al., 2005). These heightened expectations could arise due to more frequent and active discussions surrounding college for high-SES students that take place at home, among peers, and with institutional agents such as high school counselors or private college consultants (Cabrera & La Nasa, 2000b; McDonough, 1997; Perna & Titus, 2005; Roksa & Deutschlander, 2018). The culmination of varying family and schooling environments may lead high-SES students to develop a sense of “entitlement” for a particular type of collegiate education (Ford & Thompson, 2016; McDonough, 1997:9; Roderick et al., 2011).

Additionally, due to increasing competition within higher education, we also examine the extent to which the number of applications students submit contributes to the SES-based gap in college application selectivity. For instance, students may submit more applications as a strategy to improve the likelihood that they are accepted to at least one selective college (Roderick et al., 2011). Not surprisingly, advantaged students are better positioned to handle the material costs of submitting applications when fee waivers are not available or widely known (Hoxby & Turner, 2013). Indeed, past work has shown that higher-SES students submit more college applications, on average (Mullen & Goyette, 2019; Radford, 2013), and that this practice helps to explain class-based disparities in where students apply (Mullen & Goyette, 2019; Roksa & Deutschlander, 2018).

In sum, both the rational action model and the status attainment model highlight ways in which class background may shape stratified college applications. While the rational action model draws attention to the costs and benefits of applying to different schools, the status attainment model focuses on the socialized and taken-for-granted aspects of the college application process.

Differential “Returns” to College Plans

Finally, scholars of social stratification argue that differential return processes can also contribute to inequality between groups (Persell et al., 1992). For example, in our case, high-SES students may benefit not only from the types of colleges they plan to attend but also from a greater return on their expectations. This could occur, for instance, if the association between the type of college students expect to attend and the selectivity of their applications grew as SES increased.

There are a couple of reasons to suspect that SES may moderate the relationship between expectations and college applications in this way. First, high-SES students may be more likely to enact their college plans due to greater familiarity with the concrete steps necessary to apply to top colleges (Morgan, 2018). Second, even among those who plan to attend a selective college, barriers during the actual application process may differentially impact students from varying class backgrounds. For example, recent work shows that the complexity of the college application process, especially the essay portion, may disproportionately lead low-SES students to start but not finish their application submissions (Odle & Magouirk, 2023).

Data and Methods

For this analysis, we draw upon the first follow-up wave (carried out in 2012) and the 2013 update of the High School Longitudinal Study of 2009 (HSLS:09). The data are a nationally representative sample of 9th graders from more than 900 high schools (public and private) collected by the National Center for Education Statistics (NCES) in 2009. As stated by NCES, the purpose of the data collection is to monitor the transition of a national sample of adolescents from their high school experiences through their postsecondary years.

The data were collected during the spring of students’ junior year, the spring of their senior year, and three years after high school graduation. One of the innovative features of the HSLS is an enhanced focus on the dynamics of educational decision-making, especially as it relates to college choice factors (Ingels et al., 2013). Thus, the HSLS is an ideal dataset for understanding how and why SES-based disparities arise among students during the application stage of the college-going process.

The analytic sample for this study is restricted to students who have acquired a high school degree or equivalent, since we are interested in looking at college application decisions. Missing values were addressed using multiple imputation with chained equations (MI) in Stata with ten imputed datasets. We use the dependent variable in the imputation equations, but all analyses are estimated using only non-missing values of the dependent variable (Von Hippel, 2007).
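As a rough illustration of how estimates from the ten imputed datasets can be combined, the sketch below pools a single coefficient using Rubin's rules, which is the conventional approach for multiply imputed data. The text does not describe its pooling step, so treating it as Rubin's rules is an assumption, and the function name `pool_rubin` is hypothetical.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float], std_errors: Sequence[float]) -> Tuple[float, float]:
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    Returns the pooled point estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching estimates and standard errors from m >= 2 datasets")
    pooled = mean(estimates)                       # average of the m point estimates
    within = mean([se ** 2 for se in std_errors])  # average within-imputation variance
    between = variance(estimates)                  # sample variance across imputations (n - 1)
    total = within + (1 + 1 / m) * between         # total variance under Rubin's rules
    return pooled, total ** 0.5
```

Calling `pool_rubin` once per coefficient, with the ten per-dataset estimates and standard errors, yields a pooled estimate whose standard error reflects both sampling and imputation uncertainty.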

Measurement

College Application Selectivity

The dependent variable for this study is the selectivity of college applications as indicated by the highest institutional selectivity among top choice schools applied to or registered at (up to three available through the 2013 HSLS update data file) (see Footnote 2). Institutional selectivity is measured using the 2011–2012 admissions rate (including open admissions) collected by the NCES Integrated Postsecondary Education Data System (IPEDS). To ease interpretation, we reverse code the measure so that a higher value corresponds to greater selectivity (i.e., higher rejection rate). We acknowledge that much of the prior work in this area has drawn upon the Barron’s Competitiveness Index measure to examine institutional selectivity (An, 2010; Brewer et al., 1999; Holzman et al., 2020; Roksa & Deutschlander, 2018). Supplementary analyses utilizing this measure of selectivity produce strikingly similar results (see S1). Ultimately, we utilize the continuous IPEDS measure over the categorical Barron’s measure.
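The construction of the dependent variable can be sketched as follows. This is a minimal illustration, assuming admissions rates are stored as percentages from 0 to 100 and that open-admission institutions are coded as 100; the function name and the handling of missing rates are hypothetical rather than taken from the study.

```python
from typing import Optional, Sequence


def application_selectivity(admit_rates: Sequence[Optional[float]]) -> Optional[float]:
    """Return the highest rejection rate (100 - admissions rate) across schools.

    admit_rates holds admissions rates in percent for up to three schools;
    None marks a school with no rate on file.
    """
    observed = [rate for rate in admit_rates if rate is not None]
    if not observed:
        return None  # no application on record, so the outcome is unobserved
    # Reverse coding: a lower admissions rate means a higher selectivity score.
    return max(100.0 - rate for rate in observed)
```

For a student whose top choices admit 80% and 35% of applicants, the function returns 65.0, the rejection rate of the more selective school.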

Socioeconomic Status (SES)

The main independent variable is a measure of the student’s socioeconomic background as indicated by their SES composite score. In the HSLS:09 dataset, this variable is a combined index of parental education level, parental occupational prestige score, and family income (see Ingels et al., 2013 for details). While the main analyses draw upon the continuous measure of SES, we also utilize SES quintiles to compare gaps in the outcome between those in the top 20% and bottom 20% (see Footnote 3).

Performance Differentials

To estimate how differential academic performance contributes to the SES-based gap in college applications, we draw upon several high school academic metrics. We include overall 11th-grade GPA, since this most accurately represents the period surrounding a student’s expectations during the spring of their junior year and will be what students ultimately use to apply to colleges the following fall of their senior year. We also include a dichotomous indicator of AP coursework (yes = 1; no = 0), which measures whether the student has taken any AP classes as of the spring of their junior year, since this has been shown to matter for enrollment at top colleges (Espenshade & Radford, 2009). Additionally, we account for the student’s standardized test score, since research shows this metric may influence how students gauge their own competitiveness for college admissions (Meyer, 1970). This measure indicates (1) the composite SAT score for students who took the test, (2) the converted equivalent using a respondent’s ACT score, or (3) the predicted equivalent using the standardized theta score (T-score) gathered as part of the HSLS-administered 11th-grade math assessment.
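The three-part priority order for the test score measure can be sketched as a simple fallback rule. The conversion functions `act_to_sat` and `theta_to_sat` are hypothetical placeholders supplied by the caller; the study's actual ACT concordance and prediction equation are not given in the text and are not reproduced here.

```python
from typing import Callable, Optional


def composite_test_score(
    sat: Optional[float],
    act: Optional[float],
    math_theta: Optional[float],
    act_to_sat: Callable[[float], float],
    theta_to_sat: Callable[[float], float],
) -> Optional[float]:
    """Pick the first available score in the priority order described above."""
    if sat is not None:
        return sat                       # (1) observed composite SAT score
    if act is not None:
        return act_to_sat(act)           # (2) ACT score converted to the SAT scale
    if math_theta is not None:
        return theta_to_sat(math_theta)  # (3) prediction from the math assessment
    return None                          # no score available from any source
```

Passing the converters in as arguments keeps the fallback logic separate from whichever concordance table or prediction model is actually used.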

Choice Differentials

To estimate how choice differentials contribute to the SES-based gap in college applications, we include a host of variables that measure access to informational resources as well as college considerations. In terms of informational resources we include: attendance of a program at, or taken a tour of, a college campus (yes = 1; no = 0); searching for college options through the internet or through reading college guides (yes = 1; no = 0); talking with a high school counselor (yes = 1; no = 0); and talking about options with a counselor hired to prepare for college admission (yes = 1; no = 0). We also include a measure of taking a course to prepare for a college admission exam (yes = 1; no = 0).

In terms of college considerations, we account for the importance of several factors in the decision-making process. Distance measures whether being close to home is an important consideration (very important = 1; somewhat important/not at all important = 0). Differential perceptions of cost are measured in terms of the importance of cost of attendance (very important = 1; somewhat important/not at all important = 0). Academic quality/reputation measures the importance of institutional prestige (very important = 1; somewhat important/not at all important = 0), and family/friend recommendation (very important = 1; somewhat important/not at all important = 0) as well as family legacy (very important = 1; somewhat important/not at all important = 0) capture the familial influence component of the college choice decision. We also include the importance of whether the degree program of interest is offered at the school (very important = 1; somewhat important/not at all important = 0), the importance of graduate school placement (very important = 1; somewhat important/not at all important = 0), job placement (very important = 1; somewhat important/not at all important = 0), the importance of the opportunity to play school sports (very important = 1; somewhat important/not at all important = 0) and the perception of campus social life / school spirit (very important = 1; somewhat important/not at all important = 0).
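Each college consideration above uses the same recode, which can be sketched as follows. The response labels are taken from the coding described in the text, while the function name and the strict validation of unexpected responses are illustrative assumptions.

```python
def very_important(response: str) -> int:
    """Code a three-level importance item as 1 for 'very important', else 0."""
    normalized = response.strip().lower()
    allowed = {"very important", "somewhat important", "not at all important"}
    if normalized not in allowed:
        # Fail loudly rather than silently coding an unknown response as 0.
        raise ValueError(f"unexpected response: {response!r}")
    return 1 if normalized == "very important" else 0
```

Collapsing "somewhat important" and "not at all important" into 0 means each indicator contrasts students who rate a factor as very important with everyone else.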

Educational Expectations

We measure educational expectations in terms of both level and type. Educational expectations (level) is the conventional measure of expectations and indicates how far a student plans to go in school (less than high school = 1; high school = 2; some college = 3; associate’s or AA degree = 4; bachelor’s or BA degree = 5; graduate or professional degree = 6) (see Footnote 4). Educational expectations (type) indicates the kind of college students plan to attend after high school based on 2011–2012 IPEDS admissions data. To create this measure, we use the student questionnaire from the first follow-up wave in 2012, when students were largely in their junior year. In a section of the interview about future plans and preparations, students were asked “What [school that provides occupational training/2-year college/4-year college/school or college] are you most likely to attend? (Please type in the full name. Do not use abbreviations).” Thus, this question captures a student’s college plans unconditioned by whether they actually applied or were admitted (Niu & Tienda, 2008). The responses were coded in the same way we coded the selectivity of college applications (see Footnote 5).

Number of Applications

We include a measure of the number of applications submitted, since this likely varies by SES (Radford, 2013) and has been shown to matter for the types of colleges students apply to (Mullen & Goyette, 2019; Roderick et al., 2011).

Control Variables

We account for other covariates at the individual level that could also influence the selectivity of college applications. Specifically, we include measures of race/ethnicity—(non-Hispanic white = reference group) with indicators for Hispanic or Latinx, Black/African American, Asian/Asian American, and multiracial/other—and gender (female = 1; male = 0).

At the school level, there are a host of factors that could also influence the selectivity of college applications among students from diverse backgrounds. Namely, the types of high schools that low- and high-SES students attend likely differ in terms of their ability to promote selective college applications (Roderick et al., 2011). Since we know students sort into different high schools based on their family income, we include aspects of the schooling context that past research has shown to differentially promote the transition to college for students from varying socioeconomic backgrounds (Turley, 2009). Specifically, we include measures of school control (public = 0; Catholic = 1; other private = 2), school type (regular = 0; charter school = 1; special program school = 2; other including career/technical/vocational/and alternative = 3), school urbanicity (city = 0; suburb = 1; town = 2; rural = 3), and geographic region (New England = 0; Middle Atlantic = 1; East North Central = 2; West North Central = 3; South Atlantic = 4; East South Central = 5; West South Central = 6; Mountain = 7; Pacific = 8). Lastly, we account for high school size using a measure of the total enrollment of students in grades 9–12, and the percent low-income which indicates the percent of the student body receiving free or reduced-price lunch.

Analytic Strategy

To examine how and why high- and low-SES students differ in their college application behavior, we draw upon a series of Heckman regression models with selection to estimate the selectivity of college applications using measures of performance differentials, choice differentials, unequal educational expectations, and the number of applications submitted. In general, the class of Heckman models is a common strategy for dealing with sample selection bias when the outcome is not missing at random (Heckman, 1979; Stolzenberg & Relles, 1997). Thus, Heckman selection models are appropriate for purposes of our study since, in our case, we expect systematic differences in who applies to college (i.e., censoring bias). For example, we know SES predicts whether a student will apply to any college (Odle & Magouirk, 2023). We utilize the heckman command in Stata to implement a two-step model that first estimates a probit equation to predict the probability of applying to college. The second equation then fits a regression model of the highest selectivity level of college applications, conditional on applying. For this analysis, we draw upon the same set of predictors in both the selection and outcome equations, except we omit the number of applications in the selection equation, since inclusion of this variable leads to model degeneracy.
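For readers less familiar with the two-step setup, the standard textbook form of the Heckman selection model is written out below. The notation is an assumption introduced here for illustration: z_i and x_i are the predictors in the selection and outcome equations, the selection error u_i is standard normal, and rho is its correlation with the outcome error.

```latex
\begin{aligned}
s_i^{*} &= z_i'\gamma + u_i, \qquad s_i = \mathbf{1}[s_i^{*} > 0]
  && \text{(selection: applies to any college)} \\
y_i &= x_i'\beta + \varepsilon_i, \qquad y_i \text{ observed only if } s_i = 1
  && \text{(outcome: application selectivity)} \\
E[y_i \mid s_i = 1] &= x_i'\beta + \rho\,\sigma_{\varepsilon}\,\lambda(z_i'\gamma),
  \qquad \lambda(c) = \frac{\phi(c)}{\Phi(c)}
\end{aligned}
```

Here lambda is the inverse Mills ratio. In the two-step approach it is computed from the first-stage probit and entered as an additional regressor in the outcome equation, so a significant coefficient on it signals that selection into applying is not ignorable.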

To compare the relative contribution of the rational action model with the status attainment model, we perform a Blinder-Oaxaca decomposition analysis of the application gap between students from the top and bottom SES quintiles. The Blinder-Oaxaca decomposition approach is commonly used to study mean outcome differentials between two groups (Blinder, 1973; Oaxaca, 1973). We utilize the oaxaca command in Stata (Jann, 2008) to carry out the decomposition analysis. Doing so allows us to divide the gap in college applications into three components: (1) the part that is due to group differences in the predictors (or “endowments effect”), (2) the part that is due to differences in the coefficients including the intercept (or “coefficients effect”), and (3) an interaction term that accounts for the simultaneous interplay between the first two components (Jann, 2008). For the estimation parameters, we draw upon the same Heckman selection model specification as the full Model 5. Consequently, this part of the study provides an estimate of the contribution of each theoretical factor to the SES-based gap in college applications, which allows us to compare the relative contribution of each theoretical mechanism.
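The threefold decomposition can be sketched in a few lines. This is a simplified illustration that takes group means and coefficients as given and decomposes from the low-SES group's viewpoint; it ignores the Heckman selection correction and survey design used in the study, and the function and argument names are hypothetical.

```python
from typing import Dict, Sequence


def threefold_decomposition(
    means_high: Sequence[float],
    means_low: Sequence[float],
    coefs_high: Sequence[float],
    coefs_low: Sequence[float],
) -> Dict[str, float]:
    """Threefold Blinder-Oaxaca decomposition of a mean outcome gap.

    Each sequence lists the intercept first (with a mean of 1.0), followed
    by the predictors in the same order across all four arguments.
    """
    if not (len(means_high) == len(means_low) == len(coefs_high) == len(coefs_low)):
        raise ValueError("all inputs must have the same length")
    rows = list(zip(means_high, means_low, coefs_high, coefs_low))
    # Endowments: differences in predictor means, weighted by low-group coefficients.
    endowments = sum((xh - xl) * bl for xh, xl, _, bl in rows)
    # Coefficients: differences in coefficients (incl. intercept), at low-group means.
    coefficients = sum(xl * (bh - bl) for _, xl, bh, bl in rows)
    # Interaction: simultaneous differences in means and coefficients.
    interaction = sum((xh - xl) * (bh - bl) for xh, xl, bh, bl in rows)
    # Total gap in predicted mean outcomes between the two groups.
    gap = sum(xh * bh for xh, _, bh, _ in rows) - sum(xl * bl for _, xl, _, bl in rows)
    return {
        "gap": gap,
        "endowments": endowments,
        "coefficients": coefficients,
        "interaction": interaction,
    }
```

By construction the three components sum to the total gap, so dividing each by `gap` gives the percentage shares reported in a decomposition of this kind.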

Model 1 provides a baseline estimate of inequality in college applications by student SES without any covariates. Model 2 adds the individual- and school-level control variables. Model 3 includes the measures from the rational action model and Model 4 includes the measures from the status attainment model. Model 5 is the full model that includes all measures analyzed in this study. Finally, to test whether the link between expectations and applications differs by socioeconomic background, we run an additional analysis (Model 6) that includes an interaction term between SES and college expectations (type).

Results

Figure 1 provides a visual representation of the gap in college application selectivity by student SES. Namely, the two histograms displayed highlight the distinct distributions in college applications between students from the top and bottom SES quintiles. We can see that whereas high-SES students tend to apply to more selective colleges (i.e., higher rejection rate or lower admissions rate), low-SES students tend to either apply to less selective colleges or none at all. To unpack why we see these unequal patterns in college applications by socioeconomic background, we first undertake a bivariate analysis before turning to the multivariate regression models.

Fig. 1 Distribution of College Application Selectivity by Top and Bottom SES Quintiles. Note: Estimates are limited to those with a high school degree or equivalent. SOURCES: U.S. Department of Education, National Center for Education Statistics, High School Longitudinal Study of 2009 (HSLS), 2013. U.S. Department of Education, National Center for Education Statistics, Integrated Postsecondary Education Data System (IPEDS), 2011-2012

Like the histograms, Table 1 highlights significant differences in college applications between students from the top and bottom SES quintiles. As expected, we see that high-SES students not only are more likely to apply to any college, but when they do, they tend to apply to more selective colleges compared to their low-SES counterparts. In turn, looking at the key measures shown, we find that most factors differ significantly between high- and low-SES students. For example, in terms of performance differentials, we see that high-SES students tend to have significantly higher GPAs and test scores than low-SES students (p < 0.001). Likewise, we see that high- and low-SES students vary in several important ways in terms of access to informational resources and the factors they consider when making their college choice decision. Finally, we see that high- and low-SES students differ significantly in terms of how far they plan to go in school (p < 0.001), the type of college they plan to attend (p < 0.001), and the number of applications they submit during the college application phase (p < 0.001). To understand how these differences may matter for inequality in college applications by SES, we turn to a series of Heckman selection models.

Table 1 Bivariate analysis of key measures between top and bottom SES quintiles

Results of the Heckman selection models in Table 2 provide three main takeaways. First, across models we find that the inverse Mills ratio is significant, thus providing statistical support for use of the selection model over standard OLS. Second, whereas the baseline Model 1 indicates that a one-unit increase in the SES composite score is associated with an increase of 23.20 in application selectivity (or rejection rate) (p < 0.001), in the full Model 5 this relationship drops to just 1.06 (p < 0.001). Third, from Models 3 and 4 it seems that the rational action model has greater explanatory power compared to the status attainment model based on the relative size of the SES coefficient across models. Namely, the SES coefficient decreases more in Model 3 compared to Model 4. However, to get a more precise estimate of how much performance differentials, choice differentials, educational expectations, and the number of applications contribute to the SES-based gap in college applications, we turn to the Blinder-Oaxaca decomposition analysis.

Table 2 Heckman regression model of college application selectivity (N = 15,130)

From the results of the Blinder-Oaxaca decomposition analysis shown in Fig. 2 (also see Appendix Table 6), we get a deeper understanding of how the underlying mechanisms contribute to the SES-based gap in college applications. First, we see that around 85% of the application gap is due to group differences in the predictors (i.e., “endowments effect”), and 15% of the gap is due to differences in the coefficients (i.e., “coefficients effect”). Although not shown, the “interaction effect” overall did not significantly contribute to the gap in applications (p > 0.10). Second, when we compare the contribution of the two sociological models in terms of the endowments effect, we see that factors from the rational action model contribute relatively more to the college application gap. Namely, 60% of the endowments effect is due to rational action mechanisms, whereas 35% is due to status attainment mechanisms. Finally, while several factors contribute significantly in terms of the coefficients effect, one that is particularly relevant is the type of educational expectations. To further examine this, we test the interaction between SES and educational expectations.
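To put these shares on a common footing, the percentages above can be converted into shares of the total application gap. The arithmetic below is a reader's calculation from the reported figures, assuming the 60% and 35% are shares of the endowments component; the products are not numbers reported in the study.

```latex
\begin{aligned}
\text{rational action share of total gap} &\approx 0.85 \times 0.60 = 0.51 \\
\text{status attainment share of total gap} &\approx 0.85 \times 0.35 \approx 0.30 \\
\text{remaining endowments share} &\approx 0.85 \times (1 - 0.60 - 0.35) \approx 0.04
\end{aligned}
```

Under this reading, differences in rational action measures account for roughly half of the total gap and differences in status attainment measures for just under a third. The text does not attribute the remaining 5% of the endowments effect to either model.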

Fig. 2 Relative contribution to gaps in college application selectivity between top and bottom SES Quintiles from Blinder-Oaxaca decomposition analysis. Note: Estimates are limited to those with a high school degree or equivalent and are conditional on those applying to college. SOURCES: U.S. Department of Education, National Center for Education Statistics, High School Longitudinal Study of 2009 (HSLS), 2012, 2013. U.S. Department of Education, National Center for Education Statistics, Integrated Postsecondary Education Data System (IPEDS), 2011-2012

Model 6 in Table 3 reveals a significant interaction between SES and educational expectations (type) (0.03; p < 0.01). Thus, we find evidence of differential returns to educational expectations by SES in both the Blinder-Oaxaca decomposition analysis as well as this final model. To better grasp how the lower- and higher-order terms combine to shape selective college applications, we utilize Stata’s margins command to produce a predicted plot of the focal relationships. First, we generate a dichotomous measure of “selective colleges” based on whether the institution accepts less than half its applicants (yes = 1; no = 0), which is roughly equivalent to Barron’s “most competitive” and “highly competitive” categories (see Appendix Table 7). Next, we run a heckprobit selection model with the interaction terms to estimate the likelihood of applying to a selective college. Finally, we use margins to estimate the predicted probability of applying to selective colleges at varying levels of SES and educational expectations, while holding all other factors at their mean values.
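The two ingredients of this step, the dichotomous outcome and a predicted probability with an interaction term, can be sketched as below. This is a simplified probit illustration that ignores the selection correction in heckprobit; the function names are hypothetical, every coefficient is a caller-supplied input rather than an estimate from the study, and the intercept is assumed to absorb all other covariates held at their means.

```python
from statistics import NormalDist


def is_selective(admit_rate: float) -> int:
    """Return 1 if the institution accepts less than half of its applicants."""
    return 1 if admit_rate < 50.0 else 0


def predicted_probability(
    intercept: float,
    b_ses: float,
    b_expect: float,
    b_interaction: float,
    ses: float,
    expect: float,
) -> float:
    """Probit predicted probability with an SES x expectations interaction."""
    # Linear index: lower-order terms plus the product term for the interaction.
    index = intercept + b_ses * ses + b_expect * expect + b_interaction * ses * expect
    # The standard normal CDF maps the index to a probability between 0 and 1.
    return NormalDist().cdf(index)
```

Evaluating `predicted_probability` over a grid of SES and expectation values, with the other arguments fixed, mirrors the kind of predicted plot that margins produces.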

Table 3 Heckman regression model testing interaction effect of SES and educational expectations on college application selectivity (N = 15,130)

In general, Fig. 3 shows that, across SES, the predicted probability of applying to selective colleges rises as expectations increase (blue → red). For example, low-SES students who expect to attend an institution with an 80% acceptance rate (i.e., light blue dot at −1) have an 11% predicted probability of applying to a selective college compared to 23% among those who expect to attend a college with a 40% acceptance rate (i.e., orange dot at −1), controlling for all other factors. We also observe patterns related to the interaction effect. For instance, among high-SES students with the same expectations as the low-SES students just discussed, the predicted probabilities of applying to a selective college are 11% (i.e., light blue dot at 1) and 32% (i.e., orange dot at 1), respectively. That is, the same increase in expectations is associated with a gain of 12 percentage points among low-SES students but 21 percentage points among high-SES students. Thus, we see that among average-performing high school students in the U.S., high expectations do not translate into selective applications equally for those from high- and low-SES backgrounds. In other words, the returns to high expectations, in terms of selective applications, seem to pay off most for those from more advantaged backgrounds even when accounting for a host of factors that are known to matter for college admissions.

Fig. 3
Likelihood of selective applications by SES and college expectations (type). Notes: Estimates are limited to those with a high school degree or equivalent and are conditional on those applying to college. All other factors held at their mean values. SOURCES: U.S. Department of Education, National Center for Education Statistics, High School Longitudinal Study of 2009 (HSLS), 2012, 2013. U.S. Department of Education, National Center for Education Statistics, Integrated Postsecondary Education Data System (IPEDS), 2011-2012.

Discussion

Although most high school graduates in the U.S. make the transition to some type of college, gaps in where students apply are evident by socioeconomic background (An, 2010; Bowen et al., 2009; Holzman et al., 2020). While a substantial body of work has shown that higher-SES students tend to apply to more selective colleges than their lower-SES counterparts (Hoxby & Avery, 2012; Mullen & Goyette, 2019; Radford, 2013), we know relatively less about why students differ in their application behavior. In this study, we draw upon a sociological approach to compare the rational action model with the status attainment model of educational stratification. Utilizing data from the High School Longitudinal Study of 2009, and a series of Heckman selection models, we find that mechanisms related to the rational action model contribute relatively more to the SES-based application gap compared to the status attainment model, although both are important. We also reveal a significant interaction effect between SES and the type of educational expectations.

This study thus adds to our understanding of the processes that lead to unequal sorting by SES during the college application phase (Holzman et al., 2020). Although past work has helped to uncover some of the factors related to the SES-based gap in college applications (Cabrera & Nasa, 2000a; Mullen & Goyette, 2019; Roksa & Deutschlander, 2018), to our knowledge, this study provides the most comprehensive analysis to date of the underlying mechanisms that contribute to the observed disparity. First, from the Blinder-Oaxaca decomposition analysis, we find that 85% of the gap in college applications between those in the top and bottom SES quintiles is due to the endowments effect, or differences in the predictors, while 15% is due to differences in the coefficients, or the portion left unexplained. Thus, we can explain most of the SES-based gap in college application selectivity through the factors modeled in this study. Second, we estimate that 60% of the endowments effect is due to rational action mechanisms, while 35% is due to status attainment mechanisms. Consequently, although prior work has tended to focus on either the rational action mechanisms or the status attainment mechanisms, we show that both are important for fully understanding the SES-based gap in college applications.

In terms of the rational action model, this analysis confirms the importance of performance differentials in shaping unequal applications by SES (Holzman et al., 2020). The Blinder-Oaxaca decomposition analysis shown in Fig. 2 indicates that 43% of the SES-based selectivity gap in college applications is due to performance differentials, with 29% from standardized tests alone. In contrast to prior work that has focused exclusively on high-performing students (Hoxby & Avery, 2012; Lor, 2023; Radford, 2013), our study indicates that choice differentials do not contribute much to the gap in college applications. Specifically, we estimate that only about 8% of the application gap between top and bottom SES quintiles is due directly to differences in access to information and college considerations. Overall, then, it seems that in the case of SES-based disparities in college applications, secondary effects play a relatively minor role compared to primary effects (Jackson, 2013). We suspect, however, that some of the secondary effects may operate indirectly through their association with educational expectations.
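These shares can be cross-checked against the overall decomposition with simple arithmetic. The short Python sketch below assumes, as an interpretation rather than something stated explicitly in the text, that the 60% figure for rational action mechanisms is a share of the endowments effect and that performance and choice differentials together make up the rational action contribution; the variable names are illustrative.

```python
# Shares reported in the text, expressed as proportions.
endowments_share_of_gap = 0.85          # endowments effect / total gap
rational_share_of_endowments = 0.60     # rational action / endowments effect

# Implied rational action share of the total application gap.
rational_share_of_gap = endowments_share_of_gap * rational_share_of_endowments

# Components reported as shares of the total gap.
performance_share_of_gap = 0.43  # performance differentials (primary effects)
choice_share_of_gap = 0.08       # choice differentials (secondary effects)
components_total = performance_share_of_gap + choice_share_of_gap

# Both routes give about 0.51, i.e. roughly half of the total gap.
consistent = round(rational_share_of_gap, 2) == round(components_total, 2)
```

Under this reading, the two sets of figures agree: rational action mechanisms account for roughly half of the total gap, most of it through performance differentials.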

In terms of the status attainment model, this study highlights the importance of educational expectations. Whereas prior work has tended to focus on level of expectations (Mullen & Goyette, 2019), we show that type of expectations seems to matter more. Namely, results shown in Fig. 2 indicate that only 4% of the SES-based gap in college applications is due to the level of expectations, while 13% is due to the type of expectations. It is important to note, however, that this analysis focuses on college application selectivity rather than application to college. Since our modeling approach conditions on application to any college, it is likely that the level of expectations (i.e., how far) matters more for predicting whether or not a student applied at all (see selection equation section in Appendix Table 5). In turn, as shown in Table 3, we see that SES moderates the relationship between type of expectations and the selectivity of college applications. Figure 3 highlights that the payoff to higher expectations in terms of selective applications disproportionately accrues to higher-SES students.

This finding thus provides insight into an additional source of advantage for high-SES students during the college application process. For example, average-performing, low-SES students with the highest expectations have a 31% predicted probability of applying to selective colleges, while average-performing, high-SES students with the same expectations have a 46% predicted probability, holding all other factors at their mean values (Footnote 6). This provides evidence that high-SES students are more likely to enact their plans and apply to selective schools regardless of their measured performance. Consequently, these results indicate that equalizing access to information or college considerations will not necessarily lead to equal application behavior among students from differing class backgrounds. Even students who expect to attend the same type of selective college, and thus must already know about these schools and plan to attend, still exhibit differential application behavior by socioeconomic background. Future research is needed to better understand when and why this disconnect occurs at the application stage. It may be that some low-SES students visit campuses in the summer after their junior year and have a negative experience that deters them from applying to selective colleges (Radford, 2013).

From a policy standpoint, our results have implications for those aiming to increase the share of low-SES students in the pool of applicants at selective colleges. Specifically, our results reinforce the need to address the SES-based performance gap in high school to bring about greater equity during the transition to college. While past work has tended to focus on the admissions side of the equation, we draw attention to the application side as well. Our analysis shows that students sort into different pools of applicants based on their grades and test scores. In other words, we observe that students largely align their college applications with their own performance metrics (Footnote 7). This may arise due to student awareness of the relevant components and academic thresholds specified by a given institution in the admissions process. For example, students may decide where to apply in part based on how competitive they feel they would be for admission to a given college. However, because many institutions claim to base admissions decisions on a host of academic as well as non-academic factors, there is likely a larger pool of missed talent among low-SES students than previously discussed (Hoxby & Avery, 2012). For instance, recent work has shown that, regardless of academic qualifications, applying to “reach” schools increases the likelihood of enrolling at a more selective institution (Mullen & Goyette, 2019).

This study has some important limitations. First, because our data were collected prior to the pandemic, it is not entirely clear how the underlying relationships may have shifted since then. For example, we know that since the pandemic, many colleges and universities have switched to test-optional or test-blind admissions. With these changes, schools likely place greater emphasis on grades, and as a result, students may align their applications with their grades instead of test scores. If this were the case, it is unlikely that we would see much difference from the patterns observed here since grades and test scores are moderately correlated. Second, we need a better understanding of why low-SES students, even those with high expectations, do not apply to selective colleges at the same rates as their high-SES counterparts. Our work offers more support for research on the application process itself, and on how to reduce (or eliminate) the barriers that students face as they apply to college (Odle & Magouirk, 2023) (Footnote 8). Finally, while our data and analysis have allowed us to undertake a broad examination of the factors that drive differential applications to selective colleges, we acknowledge that our decomposition approach models predictors at one point in time. Future data collection and research would benefit from greater attention to the dynamic interplay of the underlying factors as they emerge over time; in other words, from investigating the longitudinal process behind these patterns.