Introduction

Improving student retention is a key concern for Australian higher education institutions and is often employed as a metric of institutional performance. Australia currently sits near the mean of Organization for Economic Co-operation and Development (OECD) nations in terms of the percentage of new students who complete a Bachelor’s degree within five years (70%; Department of Education and Training, 2018). Attrition rates at Australian universities have fallen over time as Government contributions to university fees have fallen and students accrue greater debt per unit. However, a push for participation targets of non-traditional student cohorts may lead to this number again rising (Department of Education and Training, 2018). Student attrition has direct implications for university income through lost student fees (Shah & Nair, 2010). Retention rates may have a future impact on Australian universities’ federal funding allocation (Department of Treasury, 2017). Degree completion also has implications at the individual level, as undergraduate qualifications are associated with a higher salary, and greater career satisfaction and well-being (Cassells et al., 2012; Graduate Careers Australia, 2016; Hillman & McMillan, 2005). Additionally, there is a relationship between degree completion and employment, such that individuals holding the highest level qualifications (e.g., Masters, Doctorate) are more likely to be employed than those with only a high school certificate (Wilkins & Lass, 2018). Tracking students over time, we know that approximately half (46.9%) of Australian domestic students who withdraw from their studies will return to university within eight years (Harvey et al., 2017). However, students who do not return within this time-frame are unlikely to complete a degree. 
The most common reasons cited by Australian domestic students for discontinuing their degrees include enrolling at another university, change of career path or employment, academic difficulties, mental health, or financial reasons (Harvey et al., 2017). Consequently, consideration of what encourages students to complete their enrolled course at university has been a commonly examined issue within the higher education literature (Tight, 2020).

Student retention

The literature has failed to identify a common definition for student retention, with no consensus as to whether a student is “retained” when leaving a particular course or intermitting from their studies. Tight (2020) treated student retention as synonymous with other related terms, such as student withdrawal, attrition, and dropout. Within this study, we approach retention at the university level, such that a student is retained if they stay at their university to complete their degree. Despite over four decades of research into student retention, efforts to create measures for predicting students most likely to withdraw are still emerging. Historically, student retention in tertiary institutions has been addressed using multivariate perspectives, such as Tinto’s (1975) schema of college dropout predictors. Tinto proposed that characteristics of the individual (e.g., attributes, family background) contributed to their degree of academic and social integration within a university. Drawing cross-disciplinary comparisons to clinical models predicting suicidality, Tinto (1975) proposed that individual shortcomings in academic and social integration would enhance the likelihood of an individual dropping out of college. However, contemporary approaches have acknowledged the additional role of contextual factors in shaping the student experience. Tight’s (2020, p. 693) summary of the practical outcomes of student retention research importantly noted that the research “...should not be about helping students to better adapt to the [institution] ...but about the institution adapting to the students it admits.” Echoing commentary by Zepke and Leach (2005), Tight (2020) noted that contemporary universities attempting to create student-centered, individualized support to address this changing focus faced challenges due to the increasing numbers of non-traditional university students (e.g., mature age students; as predicted by the Australian Department of Education and Training, 2018). Further research is needed to better understand the aspects of universities that can be modified to support degree completion.

The student retention research draws parallels with research conducted in other disciplines of psychology, specifically industrial and organizational psychology. Although student dropout and employee turnover may appear to be distinct fields of inquiry, Larkin et al. (2013) have argued that the way researchers consider the individual and contextual factors predicting a student’s dropout from university is not dissimilar from how organizational researchers have addressed employee turnover. The student retention literature has largely failed to benefit from the extensive literature on employee turnover (Larkin et al., 2013). Examining how other disciplines have attempted to conceptualize and address contextualized retention strategies in non-university contexts may offer alternative practical solutions to measuring factors relevant to enhancing student retention.

Several researchers have considered tertiary student retention beyond the characteristics of the student, incorporating contextual aspects that relate to turnover within a workplace (e.g., Bean, 1980, 1983; Cabrera et al., 1993; Johnson et al., 2014). Bean (1980, 1983) was among the first to adapt a model of employee turnover to understanding student retention, citing stay decisions to be the result of complex interactions between student-centered variables (e.g., GPA, goals), as well as contextual factors both internal and external to the university. Bean (1980) found that various institutionally based predictors accounted for unique variance in student intentions to leave their university (e.g., perceptions regarding the educational quality of the institution), which were akin to some of the factors proposed to precede employee turnover in Price’s (1977) model of workplace turnover. Recent findings (e.g., Morganson et al., 2015) suggest that this cross-disciplinary approach to understanding tertiary student retention may be fruitful; specifically, applying the organizationally derived theory of embeddedness to the prediction of student retention (Larkin et al., 2013).

Embeddedness

Historically, the processes leading an employee to stay at an organization were thought to be the obverse of the processes leading an employee to leave an organization (Jiang et al., 2012). Several researchers (e.g., Mitchell & Lee, 2001) criticized this position, arguing that the steps precipitating leaving employment differed from the steps associated with remaining. Mitchell and Lee (2001) described embeddedness as a construct that reflects the degree to which one feels enmeshed or (when viewed less positively) “stuck” within an organization, reducing their prospects of leaving. Perceptions of embeddedness emerge from three contextual forces: Fit, Links, and Sacrifice. Fit refers to the level of similarity between an individual and their organization (e.g., values similarities) and is described as non-affective in scope (i.e., not encompassing the degree of “liking” the organization; Mitchell & Lee, 2001). When applied to the context of tertiary education, Fit could encapsulate perceptions of fit between the individual and their major of study. Links represent the relationships an individual has with their colleagues and other members of their workplace social network (Mitchell & Lee, 2001). Sacrifice refers to what an individual perceives that they would lose if they left their organization (Mitchell & Lee, 2001). Together, these three forces bind an individual to their organization, such that the more embedded an employee is in their organization, the more likely they are to remain at their organization (Lee et al., 2014). The predictive validity of embeddedness is well supported, with a meta-analysis finding embeddedness accounted for significant incremental variance in predicting turnover intentions and actual turnover, after controlling for job satisfaction, perceived job alternatives, and affective job commitment (Jiang et al., 2012).

Embeddedness as a predictor of student retention

The job embeddedness model may be relevant to understanding why tertiary students stay or leave their university (Larkin et al., 2013). Significant relationships have been demonstrated between embeddedness scores, student intentions to leave, and actual attrition (Larkin et al., 2013). A qualitative study conducted by Morganson et al. (2015) suggested that university embeddedness may be useful in predicting retention in STEM majors, finding that Fit, Links, and Sacrifice were referenced by students as factors that enhanced their student experience. In terms of Fit, person-environment fit perceptions have been identified as a predictor of remain intentions for university students, even after controlling for the personality traits of the sampled students (Etzel & Nagy, 2016). Furthermore, mixed findings have been identified regarding the impact of student loans and financial support on retention considerations for university students, which are factors that may contribute toward Sacrifice perceptions (Kirby & Sharpe, 2001). Further, social ties (Links) have been shown to ameliorate the relationship between anxieties and academic performance at university (Brook & Willoughby, 2015). 

Embeddedness therefore appears to be a means of examining how supported students perceive their university experience to be, which Tight (2020) has previously outlined as a contextual factor that universities need to address to enhance student retention. Therefore, these individually anchored perceptions of Fit or congruence, the adequacy of their social ties per the Links factor, and the threat of resource loss per the Sacrifice factor, provide an alternative means of predicting how likely a student is to remain enrolled beyond traditional demographic-based factors, such as student age or socio-economic status (Kirby & Sharpe, 2001).

Although these findings present early evidence of embeddedness being an efficacious predictor of tertiary student retention, this research has been somewhat thwarted by a lack of an established measure of student Fit, Links, and Sacrifice. There remains no “gold standard” for measuring embeddedness in a tertiary student population, with the aforementioned studies (Larkin et al., 2013; Morganson et al., 2015) using either existing measures of embeddedness conceived in an organizational context, or only measuring concepts conceptually similar to embeddedness, such as campus involvement. There is a need to improve upon these existing approaches to the measurement of tertiary embeddedness, using the language and embedding facets identified by students.

Study structure and aims

This article details the process to provide a qualitative basis for the development of a student embeddedness scale (study one, part one), followed by two quantitative examinations of the scale’s scores for construct (study one, part two) and concurrent (study two) validity. Specifically, the objective of study one was to develop a new embeddedness measure for use with a tertiary student population after gathering evidence of the applicability of the embeddedness construct to student retention considerations. Mixed-method approaches to scale development allow for the collection of multiple sources of evidence to support scale validity (Newman et al., 2013; Zhou, 2019). Consistent with scale development methodology outlined by Zhou (2019), this mixed-methods approach begins with qualitative methods in the initial stages of item development (i.e., interviews; study one, part one). The qualitative methods allowed for the investigation of the applicability of embeddedness theory to a student context and the grounding of item wording to the student experience. Subjecting the developed items to qualitative review by subject matter experts then allowed for the collection of additional evidence of item content validity. A pilot of the developed scale (study one, part two) was administered to a separate student sample. The scale (composed of subscales reflecting Fit, Links, and Sacrifice constructs) was subjected to quantitative review to estimate the construct validity of the scale’s scores, and to make recommendations regarding the generalizability of the scale’s use (per Zhou, 2019). The polytomous Rasch measurement model (Andrich, 1978) was employed to build and quantitatively assess the efficacy of the measure’s subscales.

Study two aimed to expand upon these findings by administering a refined version of the scale to another student sample. The aim was to collect and examine exploratory evidence of measure validity, namely how these scales relate to student retention intentions as well as to established predictors of student retention, i.e., academic self-efficacy, academic goals, and academic coping (for studies linking these three variables to student retention see Devonport & Lane, 2006; Hess & Copeland, 2001; Multon et al., 1991; Pascarella & Terenzini, 1980; Robbins et al., 2004; Simon et al., 2015). These studies, therefore, aimed to advance embeddedness theory in education research by expanding upon the qualitative study by Morganson et al. (2015), and exploring whether embeddedness is relevant in explaining retention intentions in university students. Following the qualitative and quantitative mixed approaches to these studies, a discussion and conclusions will be presented.

Study one (part one) method and results: Interviews and item generation

Participants

Interview participants consisted of 15 undergraduate students (age range 18–62, M = 24 years, SD = 11 years, 46% identified as female), who were recruited from an Australian university, across academic disciplines. Participants were recruited through social media, posters, and snowball sampling.

Procedure

Interviews

A series of one-on-one interviews was conducted to determine whether the Fit, Links, and Sacrifice dimensions related to students’ self-reported reasons for remaining enrolled at university or in their majors. A semi-structured interview schedule was employed, consisting of broad, open-ended questions about the students’ experiences, combined with questions specific to embeddedness with their institution and current academic major, such as: “Can you describe the kinds of social ties and other relationships you have while studying at university?”. Deductive thematic analysis was conducted to analyze patterns, themes, and categories within the data, following the procedures described by Braun and Clarke (2006). This approach allowed for data to be analyzed in a top-down manner, congruent with a predetermined theoretical framework, i.e., the dimensions of embeddedness. Interpreting the verbatim interview transcripts began with the second author reading the transcripts to become familiar with the data, and then highlighting text relevant to the dimensions of embeddedness. After the second author independently coded three interview transcripts, all authors met and generated a list of codes to guide the coding of the remaining interview transcripts. To ensure inter-rater reliability, a random sample of transcripts was recoded by the first author, and compared for consistency with the second author’s coding. The researchers met to discuss any disagreements in coding until 100% agreement was reached. Career opportunities, access to desired resources, value for course, and value for learning were identified as the four sub-themes of Fit. Valued peer relationships and valued staff relationships were identified as the two sub-themes of Links. Opportunities for socialization, sense of direction, and sunk costs were identified as the three sub-themes of Sacrifice. These themes and sub-themes are outlined in Appendix 8.1, Appendix 8.2, and Appendix 8.3.

Developing the item pool

Items were developed to reflect the findings of the thematic analysis. Additional items were adapted from a series of existing measures related to embeddedness and retention, such as measures of subjective fit perceptions (Cable & DeRue, 2002), person-environment fit, student-institution fit (Denson & Bowman, 2015), academic and social integration (Pascarella & Terenzini, 1980), job embeddedness (Mitchell & Lee, 2001), and global job embeddedness (Cunningham et al., 2005). Where appropriate, items were re-phrased to ensure suitability for a tertiary student population, for example, “I’m too caught up in this organisation to leave,” was rephrased as “I’m too caught up in this university to leave.”

SME item evaluation

Five subject matter experts (SMEs), working in student retention (university student support officers), were recruited from an Australian University via email. For each of the items, SMEs had to indicate how important they believed the item was to understanding retention on a five-point Likert-type scale, ranging from 1 “not at all important” to 5 “extremely important.” SMEs were also required to indicate whether they believed each item was burdensome, required the responder to reveal sensitive information, prone to socially desirable responses, or difficult to comprehend. Items marked “slightly important” and “not at all important” to retention were considered for deletion. Items that were identified as problematic but were identified as important for retention were reworded, or deleted if rewording failed to solve the issue identified by the SME. Based on SME judgement, the original item pool of 123 items was reduced to 83 items, which were then subjected to examination against the assumptions of a Rasch model.

Study one (part two) method: Scale pilot testing and calibration

Participants

For examination of the scale properties of the 83-item measure in relation to the assumptions of a polytomous Rasch model, data from 299 Australian tertiary students were collected (M = 26.7 years, SD = 9.75 years, 82.3% identified as female). Highest education levels of the participants’ parents varied, with 37.4% having completed high school, 39.5% having completed Australian Qualifications Framework (AQF) levels 1 to 7 (i.e., Certificate I to Bachelor’s Degree levels), and the remaining parents with AQF levels 8 to 10 (i.e., Graduate Diploma to Doctorate) qualifications as their highest level of education. The language most commonly spoken at home for students was English (91.6%), and most of the participants entered university either via tertiary entrance ranking examinations (52.5%) or via mature-aged student pathways (31.8%). Most of the sampled students had completed at least one semester of university prior to participation in the study (89.6%), and the most commonly reported course of study was Health Sciences (45.5%). Most (89.3%) participants completed all questionnaire items, with incomplete data included as part of the analyses for each subscale. This sample differs from an average Australian tertiary student sample, which has a lower percentage of female students (56.8%) and fewer students studying health courses (16.1%; Universities Australia, 2019). However, the high level of females in this sample is reasonably representative of the gender split in Australian Health Sciences courses, with 72% of students enrolled in health courses identifying as female (Australian University Network, 2015).

Procedure

The study was advertised to students via social media platforms (e.g., Facebook, Reddit), and posters placed on Australian university campuses. Participants voluntarily participated in this study after reading a participant information letter, and completing a consent form. The questionnaire was presented using Qualtrics survey software (Qualtrics Labs, Inc., 2019). Although convenience sampling may introduce potential sampling bias, item invariance, according to gender, age, median splits of age, Index of Relative Socioeconomic Advantage and Disadvantage (IRSAD) scoring based on current postcode, and IRSAD scoring based on high school postcode, was directly explored as part of the scales’ score adherence to the assumptions of the Rasch measurement model, and limitations regarding generalizability are explored further in the Discussion. The initial examination of the adequacy of the Fit, Links, and Sacrifice measures was undertaken via an evaluation against the assumptions of a Rasch measurement model. A rating scale polytomous Rasch model (Andrich, 1978) was employed for each embeddedness measure instead of a partial-credit model (Masters, 1982), due to homogenous response option presentation (i.e., a 1–6 Likert-style format) for all items. Model adequacy was examined on the basis of person reliability and stratification indices, and item reliability and stratification indices (i.e., larger values being indicative of greater reliability in person or item ordering), in addition to measure targeting (i.e., to reduce the prospect of ceiling/floor effects the average point of measure “difficulty” should be close to the average point of person ability).
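As an illustration of the rating scale model named above, the following minimal sketch computes the category probabilities for a single item. It assumes the standard parameterization in which every item shares one set of threshold parameters; the function name, argument names, and the numeric values are hypothetical and are not taken from the study's data or software.

```python
import math

def rsm_category_probs(theta, delta, thresholds):
    """Category probabilities for one item under the rating scale model.

    theta: person location in logits.
    delta: item location (difficulty) in logits.
    thresholds: the m threshold parameters shared by all items, giving
        m + 1 ordered response categories.
    """
    # Cumulative sums of (theta - delta - tau_j); the lowest category
    # corresponds to an empty sum, i.e. zero.
    exponents = [0.0]
    for tau in thresholds:
        exponents.append(exponents[-1] + (theta - delta - tau))
    # Subtract the largest exponent before exponentiating for stability.
    peak = max(exponents)
    weights = [math.exp(e - peak) for e in exponents]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical values: a six-category item with five ordered thresholds.
example_probs = rsm_category_probs(
    theta=0.5, delta=0.0, thresholds=[-2.0, -1.0, 0.0, 1.0, 2.0]
)
```

Under a partial-credit model, each item would instead carry its own threshold list; sharing one list across items is what the homogeneous response format is taken to justify here.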

Study one (part two) results: Scale pilot testing and calibration

Fit measure

Univariate structure and independence

The initial set of 32 Fit items was examined for a univariate structure via a principal components analysis of the model’s item residuals (see Footnote 1). Six items were identified as a potential off-measure cluster due to their notable eigenvalue coefficient (4.02, 7.2% measure variance) and borderline disattenuated r = 0.72 coefficient with another residual cluster. As these items appeared to have a similar basis of reflecting global perceptions of Fit with the university (e.g., “I fit well with my university”), which differed from the specific facets of fit targeted by the remaining items (e.g., “I fit well with my course”), these items were removed from further analysis. No further dimensionality concerns were identified on the basis of eigenvalues or disattenuated correlations for the remaining clusters. Tests of local item independence via calculation of Yen’s Q3* coefficients were conducted in an iterative manner (see Footnote 2). Q3* coefficients were recalculated following the removal of an item from a pair violating the local independence assumption. The item removal decisions were based upon item fit characteristics (e.g., Infit/Outfit coefficients), and judgements of item content synonymity, until no further violations of independence were noted. Fourteen items (e.g., “I am suited to my course”) demonstrating item dependence were removed in an iterative manner. Table 1 outlines the change in measure targeting, person reliability coefficients, and item reliability coefficients to the retained 12 items.
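The local independence check can be sketched as follows. Yen's Q3 for a pair of items is the correlation between their model residuals (observed minus expected scores) across persons. The adjusted coefficient below subtracts the average Q3 over all pairs; whether this matches the exact Q3* definition given in the authors' footnote is an assumption, and the residual values and function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def q3_pairs(residuals):
    """Yen's Q3 for every item pair.

    residuals: one row per person, each row a list of item residuals
    (observed response minus model-expected response).
    """
    n_items = len(residuals[0])
    columns = [[row[i] for row in residuals] for i in range(n_items)]
    return {
        (i, j): pearson(columns[i], columns[j])
        for i in range(n_items)
        for j in range(i + 1, n_items)
    }

def q3_mean_adjusted(q3):
    """Assumed adjustment: each pair's Q3 minus the mean Q3 of all pairs."""
    mean_q3 = sum(q3.values()) / len(q3)
    return {pair: value - mean_q3 for pair, value in q3.items()}

# Hypothetical residuals for five persons on three items; items 0 and 1
# are deliberately identical to mimic a locally dependent pair.
rows = [
    [0.5, 0.5, -0.2],
    [-0.3, -0.3, 0.4],
    [0.1, 0.1, -0.6],
    [-0.4, -0.4, 0.1],
    [0.2, 0.2, 0.3],
]
q3 = q3_pairs(rows)
q3_adjusted = q3_mean_adjusted(q3)
```

In an iterative procedure of the kind described, one item from the pair with the largest adjusted coefficient would be dropped, the model refitted, and the coefficients recomputed.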

Table 1 Iterative measure, person-reliability, and item-reliability coefficients per measure

Response category adequacy

Facets of the response categories, such as adequate category fit (i.e., Outfit MnSq < 2.00), non-subsumed peaks on a plot of the category probabilities, and consistency between predicted and observed responses via a confusion matrix, were inspected to ascertain the efficacy of response categories (Linacre et al., 2002). The six response categories of the Fit measure suggested poor discrimination between categories by participants (see Fig. 1), which was supported by disordered thresholds and poor response category fit upon inspection of the Outfit MnSq coefficients for the “strongly disagree,” “disagree” and “somewhat disagree” response options. Collapsing the “strongly disagree” and “disagree” options partially addressed the aforementioned issues, and led to trialling the collapse of all “disagree” response categories, such that the measure retained a general “disagree” option alongside the original “somewhat agree,” “agree,” and “strongly agree” options. This recoding method retained adequate person/item reliability, while resolving the threshold ordering and category redundancy concerns.

Fig. 1 Response category probability curves for the original six option scale of the Fit measure (left), and probability curves following category collapse (right)

Item misfit

Two items, “By finishing my studies, I will have access to high-income jobs” and “The structure of my course suits me well,” were judged by the authors to be misfitting (see Footnote 3). As neither item occupied a niche position in the measure’s array of item difficulties, they were removed from further analyses.

Invariance

On the basis of age, the items “Expanding my knowledge is important to me” (Mantel-Haenszel χ2 = 9.56, p = .002, ∣DIFContrast∣ = 0.85), and “The courses available in this university match my interests” (Mantel-Haenszel χ2 = 9.06, p = .003, ∣DIFContrast∣ = 0.71) appeared to demonstrate response bias (see Footnote 4). For the first item, older students at an equivalent estimated level of the underlying Fit construct found this item easier to endorse in comparison to their younger peers. Inversely, younger students found the second item comparatively easier to endorse. Following removal of the first item, evidence of DIF remained for the second item; therefore, both were removed. No further evidence of DIF was noted.
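For readers unfamiliar with the Mantel-Haenszel statistic reported above, a minimal sketch follows. It uses the dichotomous form of the test, in which responses are split into "endorsed" versus "not endorsed" within strata of matched ability; the study's items were polytomous and its software is not named in this section, so this simplification is an assumption. All function names and counts are hypothetical. The second helper converts a chi-square value with one degree of freedom into a p-value, which reproduces the p-values reported for χ2 = 9.56 and χ2 = 9.06.

```python
import math

def mantel_haenszel_chi2(strata, continuity=True):
    """Mantel-Haenszel chi-square across matched strata.

    strata: iterable of (a, b, c, d) counts, one 2x2 table per ability level:
        a = reference group endorsing, b = reference group not endorsing,
        c = focal group endorsing,     d = focal group not endorsing.
    """
    observed = expected = variance = 0.0
    for a, b, c, d in strata:
        total = a + b + c + d
        if total < 2:
            continue  # a stratum this small contributes no information
        n_ref, n_foc = a + b, c + d
        m_yes, m_no = a + c, b + d
        observed += a
        expected += n_ref * m_yes / total
        variance += n_ref * n_foc * m_yes * m_no / (total ** 2 * (total - 1))
    if variance == 0:
        raise ValueError("no stratum contributes variance")
    deviation = abs(observed - expected)
    if continuity:
        # Continuity correction: shrink by 0.5, floored at zero.
        deviation = max(0.0, deviation - 0.5)
    return deviation ** 2 / variance

def chi2_p_value_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical single stratum in which the reference group endorses more often.
chi2_example = mantel_haenszel_chi2([(15, 5, 5, 15)], continuity=False)
```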

The final measure of Fit consisted of eight items with four response categories, which explained 50.1% of the variance in participant responses to these items. Table 2 presents the item fit information, and Table 1 presents the iterative measure properties in terms of person/item reliability for each step. In summary, a satisfactory measure of Fit was established, although we noted a potential ceiling effect, as indicated by the positive targeting coefficient, which should ideally be close to a value of zero to indicate better targeting of average Fit.

Table 2 Item fit and correlation coefficients for the Fit measure

Links measure

Univariate structure and independence

The 32-item Links measure presented immediate concerns of multidimensionality; the eigenvalue (8.37, 12.7% of observed variance) of the first component extracted from the model residuals was notable, and examination of the contrast plot indicated a vertically distinct item cluster. Accompanied by the disattenuated r = 0.61 coefficient (i.e., <50% shared variance), and the off-measure item content reflecting university faculty or professional links, we removed these 12 items temporarily for separate analysis, and continued with the remaining 20 Links items pertaining to peer-related links. The retained items demonstrated no further evidence of multidimensionality (e.g., the lowest disattenuated correlation between clusters was r = 0.96).
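The "<50% shared variance" reading of the disattenuated coefficient follows from squaring the correlation. The sketch below shows that arithmetic together with the usual correction for attenuation, in which an observed correlation is divided by the square root of the product of the two reliabilities. The function names are hypothetical, and the cluster reliabilities in the usage note are illustrative because the paper reports only the corrected coefficients.

```python
import math

def disattenuated_correlation(observed_r, reliability_a, reliability_b):
    """Correct an observed correlation for unreliability in both measures."""
    return observed_r / math.sqrt(reliability_a * reliability_b)

def shared_variance(r):
    """Proportion of variance shared by two measures correlating at r."""
    return r ** 2

# Reported coefficient for the off-measure Links cluster.
links_shared = shared_variance(0.61)
# Reported lowest coefficient among the retained peer-related clusters.
retained_shared = shared_variance(0.96)
```

With r = 0.61 the clusters share roughly 37% of their variance, below the 50% mark, whereas r = 0.96 implies about 92%. As a purely illustrative case, `disattenuated_correlation(0.5, 0.8, 0.8)` gives approximately 0.625.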

Examination of local item independence violations for the 20 items was conducted per the prior Fit analyses. Five items were removed in this manner (e.g., “I feel connected to the other students in my course,” Q3∗ = .43). Table 1 reflects the change in measure targeting and reliability as a consequence of these item reductions.

Response category adequacy

Poorly discriminating response categories appeared to be a potential weakness of the Links measure (see Fig. 2). Poor discrimination was most apparent between the three “disagree”-based categories; however, an improvement in category discrimination and measure adequacy was noted following the collapse of the three categories into a general “disagree” category.

Fig. 2 Response category probability curves for the original six option scale of the Links (peers) measure (left), and probability curves following category collapse (right)

Item misfit

Four misfitting items (e.g., “I am connected to clubs and societies within my university”) were removed in an iterative manner after failing to fall within the Infit/Outfit coefficient thresholds indicative of adequate fit as detailed in the prior Fit measure results. Eleven remaining items that satisfied the fit assumption were retained.

Invariance

The IRSAD value of high school postcode was a potential factor related to differential responding on the item “My friendships enrich my overall university experience” (Mantel-Haenszel χ2 = 6.26, p = .012, ∣DIFContrast∣ = 0.67). This item appeared to be more difficult to endorse for lower IRSAD category participants, despite the measure predicting similar underlying levels of perceived Links. After removing this item, no further evidence of DIF was found.

The final Links measure, which focused on peer-related links, demonstrated adequate measurement properties as a 10-item, four response category measure (see Table 3 for item fit coefficients). The near-zero targeting of the final measure, per Table 1, suggested that this aspect of the measure was a satisfactory match between average participant Links and the difficulty of the Links items.

Table 3 Item fit and correlation coefficients for the Links (peers) measure

Secondary links measure

Univariate structure and independence

The previously extracted 12 items representing university staff-related links from the initial Links dimensionality testing were separately examined for consistency with the Rasch model assumptions. While the eigenvalue (2.12, 7.8% variance) of the largest off-measure cluster extracted from the principal components analysis of model residuals exceeded 2.0, the smallest disattenuated correlation (r = 0.86) between clusters did not suggest notable violations of the univariate measurement assumption. Following removal of two items that failed the item independence analyses described previously (e.g., “The academics in my chosen field of study are approachable”), the remaining ten items were retained for further analysis.

Response category adequacy

In a similar outcome to that of the peer-based Links measure, limitations regarding the discrimination of the “disagree”-based categories were again evident (see Fig. 3). The solution of iteratively collapsing the “disagree” categories until no further potential response category concerns were observed was re-employed and carried on in all forthcoming analyses.

Fig. 3 Response category probability curves for the original six option scale of the secondary Links (staff) measure (left), and probability curves following category collapse (right)

Item misfit

After removing one misfitting item (“I have developed professional relationships with people who have experience in my field of study”) using the methods outlined previously, no further concerns related to item misfit were noted.

Invariance

Age-related DIF concerns were noted for two of the items in the staff-based Links measure. The first concerning item, “My course gives me the opportunity to build professional relationships,” demonstrated a potential age-related response bias with younger students finding this item easier to endorse in comparison to older students (i.e., Mantel-Haenszel χ2 = 13.78, p < .001, ∣DIFContrast∣ = 0.83). A similar age-related bias was noted for the item “The relationships I have with staff will make pursuing my future employment pathway easier.” Following the removal of both items, no further evidence of DIF was noted.

Item fit information for the staff-based Links measure is presented in Table 4. In a similar finding to that of the peer-based Links measure, the staff-based Links measure appeared to be adequately targeted toward the average participants’ degree of perceived staff-based Links.

Table 4 Item fit and correlation coefficients for the Links (staff) measure

Sacrifice measure

Univariate structure and independence

The 19-item Sacrifice measure presented initial limitations in addressing the univariate structure assumption of the Rasch model. The principal components analysis of model residuals identified a cluster of seven items that had limited shared variance with the remaining clusters of item residuals (Eigenvalue = 3.11, 8.3% variance; r = 0.42, < 50% shared variance). Removing these items and re-examining the dimensionality of the measure suggested no further concerns regarding multidimensionality.

Examination of the assumption of local item independence presented two concerning item pairs based on the previously outlined methodology; following removal of dependent items, the remaining ten items were then examined for response category adequacy.

Response category adequacy

As demonstrated in Fig. 4, the discrimination of the response categories was generally weak, as indicated by the generally subsumed peaks of the response probability curves, and was compounded by disordered item thresholds. Collapsing the “disagree”-valenced options to address their poor discrimination improved the measure’s targeting and its person and item reliability coefficients (see Table 1). Because evidence for the adequacy of the response categories remained poor after this merge, the “somewhat agree” and “agree” response options were also collapsed, which produced a further improvement in person reliability. This modification to the scoring of the measure produced the best available solution to the response category discrimination concern.
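The two-step recoding described above can be sketched as a simple mapping. This minimal illustration assumes the six original options were coded 1 to 6 from “strongly disagree” to “strongly agree”. The labels of the original disagree-side options, the numeric codes, and the names used here are assumptions for illustration only.

```python
# Assumed original coding (disagree-side labels are hypothetical):
# 1 = strongly disagree, 2 = disagree, 3 = somewhat disagree,
# 4 = somewhat agree,    5 = agree,    6 = strongly agree

# Step 1: merge the three "disagree" options, leaving four categories (0-3)
SIX_TO_FOUR = {1: 0, 2: 0, 3: 0, 4: 1, 5: 2, 6: 3}

# Step 2: also merge "somewhat agree" and "agree", leaving three categories (0-2)
FOUR_TO_THREE = {0: 0, 1: 1, 2: 1, 3: 2}


def collapse(responses, mapping):
    """Recode a list of responses; missing values (None) stay missing."""
    return [None if r is None else mapping[r] for r in responses]


raw = [1, 3, 4, 5, 6, None, 2]            # made-up responses to one item
four_category = collapse(raw, SIX_TO_FOUR)
three_category = collapse(four_category, FOUR_TO_THREE)
```

Keeping each step as a separate mapping makes it easy to re-estimate the model after every merge and to stop as soon as the category thresholds are ordered.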

Fig. 4

Response category probability curves for the original six option scale of the Sacrifice measure (left), the four category version (middle), and the three category version (right)

Item misfit

One item (“There are advantages to being at my university”) was excluded from further analyses due to poor item fit based on the previously outlined criteria for assessing this assumption.

Invariance

Older participants found it easier to endorse the item “I have committed a lot of time to this academic course” in comparison to their younger peers with comparable estimated Sacrifice perceptions, Mantel-Haenszel χ2 = 8.29, p = .004, ∣DIFContrast∣ = 0.81. Following the removal of this item, a further item “Changing to another university would be difficult” was similarly identified as more difficult to endorse by older participants in comparison to younger participants at equivalent levels of Sacrifice perceptions, Mantel-Haenszel χ2 = 14.23, p < .001, ∣DIFContrast∣ = 0.66. However, the removal of this item rendered the Sacrifice measure bereft of value, with the person reliability index coefficient < .50; therefore, further examination of this measure for the remaining DIF contrasts was halted.

Secondary sacrifice measure

Following the limited evidence of measurement adequacy for the original Sacrifice items, the seven extracted items constituting a notable cluster in the previous analysis were separately examined for measurement adequacy. No measurement issues with respect to the univariance and independence assumptions were observed. Response category adequacy for this shorter Sacrifice measure was addressed in the same approach as the previous Sacrifice measure. Two items demonstrated potential concerns regarding item misfit and were removed based on the item misfit criteria discussed for the previous measures. For the remaining five items, DIF limitations on the basis of participant age grouping and socioeconomic advantage were observed for many items, rendering the shorter Sacrifice measure untenable against the assumptions of the Rasch model. Examination of the shorter Sacrifice measure was halted at this stage of analysis.

Study two: Method

Participants

Data were collected from 196 Australian tertiary students (M = 24.5 years of age, SD = 8.36, 78.1% identified as female). The highest levels of completed education for participants’ parents varied across the sample, with 35.7% having completed high school, 47.4% having completed AQF levels 1 to 7, and the remaining students’ parents having AQF levels 8 to 10 as their highest education qualifications. All students who participated in this study were domestic students, with domestic students making up almost 66% of all enrolled Australian university students (Universities Australia, 2019). Most of the students reported that English was the most common language spoken at home (93.4%), and most students reported entry into their university course using their tertiary education examinations (54.6%). Echoing Study One’s sample, most of the participants had completed at least one semester of university before responding to the questionnaire (81.1%), and most of the students (64.8%) reported Health Sciences as their enrolled course type. Complete responses across all variables were recorded for 57% of the sample, with missing data addressed in the Results. As with Study One, although this sample differs from an average Australian tertiary student sample, its gender split is similar to that in Australian Health Sciences courses (Australian University Network, 2015; Universities Australia, 2019).

Materials

University embeddedness

University embeddedness was measured with the 25-item University Student Embeddedness (USE) instrument developed in Study One, consisting of an eight-item Fit scale, a 10-item Links (peers) scale, and a seven-item Links (staff) scale. These scales employed four response categories: “disagree,” “somewhat agree,” “agree,” and “strongly agree.”

Academic self-efficacy

Academic self-efficacy refers to students’ perceptions of their ability to carry out tasks necessary for academic success (Chemers et al., 2001), and was measured using the five-item Academic Efficacy Scale from the Patterns of Adaptive Learning Scales (PALS; Midgley et al., 2000). Students were asked to indicate the extent to which they believed each of the statements to be true of their work at university, responding on a five-point Likert-type scale ranging from “not at all true” to “very true” (e.g., “I’m certain I can master the skills taught in class this year”). Midgley et al. reported acceptable reliability for this measure (α = .78).

Non-academic goal setting

Academic goals refer to one’s persistence and commitment toward academic-related action (Robbins et al., 2004). Non-academic goal setting was measured with the Desire to Finish College scale (Allen, 1999). This six-item scale measures the strength of goals related to completing university, with high scores indicative of poor academic goals. Students were asked to rate a series of statements on a Likert-type scale ranging from 1 “not at all true” to 7 “completely true” (e.g., “I dread the thought of going to university for several more years”). Allen (1999) reported acceptable reliability for this measure (α = .76).

Academic coping

Academic coping refers to the conscious regulation of thoughts, feelings, and behavior in response to particular academic events or circumstances, such as receiving a bad grade (Sullivan, 2010), and was measured with 15 items from the Approach sub-scale of the Academic Coping Strategies Scale (Sullivan, 2010). This measure specifically assessed the frequency with which students engaged in specific strategies after receiving a low assessment grade. Participants responded on a five-point Likert-type scale ranging from 1 “never” to 5 “almost always” (e.g., “trying to find out what you did wrong”). Sullivan reported strong reliability (α = .91).
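The reliability coefficients cited for the measures above are presumably Cronbach’s alpha, since the source reports only “α”. A minimal sketch of that computation, using made-up responses and a hypothetical function name, is:

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for complete data.

    rows: list of respondents, each a list of k item scores.
    """
    k = len(rows[0])
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed scale score
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up responses from five respondents on a three-item scale
scores = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3], [1, 2, 2]]
alpha = cronbach_alpha(scores)
```

Alpha approaches 1 as the items covary more strongly relative to their individual variances, which is why it is read as an index of internal consistency.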

Intentions to stay at university

Students’ intentions to stay at university were measured using a single item: “How likely are you to remain enrolled at your university to complete your degree?”. Students responded to this item on a six-point Likert-type scale ranging from 1 “very unlikely” to 6 “very likely.”

Procedure

Participants were made aware of the study’s availability using methods similar to those of Study One (part two): social media, posts on internet discussion boards, and poster advertisements at Australian universities. Participants were asked to provide their demographic information, followed by the item assessing their intention to remain at university. The remaining measures (e.g., Fit, academic coping, etc.) were presented in a randomized order after the intention to remain item to lessen the probability of reported stay intentions being influenced by the salience of embedding factors. Participants took approximately 15 minutes to complete all measures.

Study two: Results

Correlations with academic predictors

As would be expected of this variable, the stay intent data were negatively skewed, such that most participants indicated an intention to remain with their university. Subsequent efforts to transform the data to achieve normality were unsuccessful. Non-parametric bivariate correlations between the academic, embeddedness, and retention variables were instead calculated. Due to non-forced responding in the questionnaire, pairwise correlation estimates were calculated for each variable pair, with the smallest count of cases for these correlations (n = 112) suggesting that the analyses were adequately powered. As presented in Table 5, the Fit measure was significantly correlated with intent to stay. Although neither Links (peers) nor Links (staff) was related to intent to stay, both were significantly related to academic coping, and negatively related to non-academic goal setting. Additionally, Links (staff) was positively related to academic self-efficacy. Other theoretically consistent correlations were found, such as all USE measures being significantly positively correlated with each other, and self-efficacy and coping both being significantly negatively correlated with non-academic goal setting. These findings are addressed against the previously reviewed literature in the discussion section that follows.
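The pairwise approach described above can be sketched as follows. The source does not name the non-parametric coefficient, so this sketch assumes Spearman’s rho; the function names and scores are made up for illustration.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_pairwise(x, y):
    """Spearman's rho using only cases observed on both variables.

    None marks a missing response. Returns (rho, number of complete pairs).
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    xs, ys = zip(*pairs)
    rho = pearson(average_ranks(list(xs)), average_ranks(list(ys)))
    return rho, len(pairs)


# Made-up scores: Fit totals and the six-point stay-intent item
fit = [12, 18, None, 25, 30, 21, 15]
stay = [3, 5, 6, None, 6, 6, 4]
rho, n_pairs = spearman_pairwise(fit, stay)
```

Because each coefficient uses a different subset of cases, the number of complete pairs should be reported alongside it, as the minimum n is here.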

Table 5 Non-parametric correlations between embeddedness, academic, and retention variables

Discussion

The student embeddedness scale was developed in response to the call for a measure of university student embeddedness (Larkin et al., 2013). Study one showed that the University Student Embeddedness (USE) scales of Fit, Links (peers), and Links (staff) demonstrated acceptable measurement properties. The item development process and SME evaluation provided support for content validity, specifically that the developed items related to the intended embeddedness constructs and were perceived to be relevant to the prediction of retention. The developed eight-item Fit measure, 10-item Links (peers) measure, and seven-item Links (staff) measure all provided good evidence of acceptable measurement properties, including that each scale measured a unidimensional construct. The invariance of the items to gender, age, and social advantage/disadvantage suggests that the scales would be appropriate for use with an Australian tertiary student sample. The results of the interviews and subsequent assessment by SMEs suggest that Fit, Links, and Sacrifices are all important elements in students’ decisions to remain enrolled at their universities or in their courses, although our attempt at creating a Sacrifice measure encountered difficulties, as outlined in the results.

Unlike Morganson et al. (2015), who found embeddedness in the academic major to be more important than embeddedness with the wider university community when considering intent to stay, participants in Study One identified both aspects of embeddedness as important. Consistent with Morganson et al. (2015), value for the course and value for learning emerged as Fit factors important to the student experience. Although Morganson et al. (2015) classified access to desired resources and career opportunities as Sacrifice factors, in this study they emerged as factors that participants considered when developing their perceptions of course/university-person Fit. Consistent with Morganson et al. (2015), peer and staff relationships were both identified as Links factors important to the university experience. The measurement model findings in Study one part two suggested a distinction between peer-based Links and staff-based Links when measuring this facet of embeddedness.

One of the major Sacrifice themes identified in Study one was sunk costs, particularly the loss of financial and temporal investments. Sunk costs were also identified as an important sacrifice theme by Morganson et al. (2015). Unlike Morganson et al. (2015), who identified loss of prestige or esteem as core Sacrifice components, participants in this study were more concerned that leaving university would reduce their social opportunities. Forfeiting relationships with colleagues is aligned with existing definitions of Sacrifice in organizational contexts (Mitchell & Lee, 2001). Although the Sacrifice scale failed to meet the requirements of the Rasch model, it is recommended that future efforts to develop a similar scale develop items aligned with the themes (see Appendix 8.3) identified in this study.

Study two found mixed evidence for the utility of the University Student Embeddedness (USE) instrument. Only the Fit scale was found to significantly relate to stay intentions, although all scales showed significant relationships with academic-related skills and motivations (e.g., academic self-efficacy). These results provide support for the applicability of embeddedness constructs to a tertiary student context, and add to our understanding of what facilitates a positive university experience for students. Our results suggested that the level of perceived similarity between a student and their university may relate to their retention decisions. These results are aligned with previous research (Larkin et al., 2013; Morganson et al., 2015) showing that Fit is related to retention decisions, although they failed to replicate a relationship between retention and the other dimensions of embeddedness. The utility of the Fit measure suggests that alignment between the content of the course and the goals of the individual is pertinent to the student remaining with their university. Furthermore, students need to perceive alignment between their aptitudes, skills, and values and those of their tertiary institution. These preliminary results suggest that Bean’s (1983) organizationally-based model of the factors that embed an employee within their organization is also relevant to understanding attrition in universities. Specifically, the forward-focused aspect of Fit (i.e., that the course aligns with the desired career path and will enable the student to reach their goals) aligns with Bean’s model and with findings that the perceived utility of the degree is associated with student attrition. Furthermore, these results expand upon recent research (Etzel & Nagy, 2016) indicating that the Fit between the student and university is an efficacious predictor of retention. This study also extends the qualitative work by Morganson et al. (2015), suggesting that embeddedness is relevant to predicting retention in students beyond those enrolled in STEM majors.

Limitations

A key limitation of the present study was that students’ self-reported intentions to stay at university were used as a proxy for actual retention. Research suggests that the correlation between intentions and actual retention is somewhere between r = .23 and r = .60 (Larkin et al., 2013; Pleskac et al., 2011). Additionally, the use of a single item to assess retention, along with a negative skew in this variable indicating that most students intended to complete their degrees, restricted our ability to discriminate between participants high in stay intentions. Future efforts to establish the predictive validity of the USE should incorporate longitudinal data of actual student retention.
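To make the strength of this proxy concrete, the proportion of variance that intentions share with actual retention is the squared correlation. Applied to the range cited above, and assuming a linear relationship:

```latex
r^2 = (.23)^2 \approx .05 \qquad \text{to} \qquad r^2 = (.60)^2 = .36
```

That is, stay intentions would account for roughly 5% to 36% of the variance in actual retention.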

Context-specific factors warrant consideration regarding the applicability of the USE beyond the examined samples. Although the Australia-situated sample is discussed in the applications section that follows, the type of university (e.g., whether the university is a larger, research-oriented university) from which the participants were sampled may have had implications for the effects of embedding factors on retention considerations. Although we did not gather data on the specific university in which participants were enrolled, it is possible that the type and teaching or research focus of universities may have influenced the Links construct. For example, the Department of Education, Skills and Employment (2020) recorded differences in 2018 of full-time-equivalent staff-to-student ratios of 15.53 for Bond University, a non-profit teaching-focused university, in comparison to a ratio of 32.10 for Victoria University, a public research and teaching university. While class sizes are not necessarily indicative of the potential to form Links with educators and peers, potential heterogeneity of the sample regarding university context may have attenuated correlations. Similarly, perceptions of the Sacrifice construct may have varied depending on the availability of student places in universities within Australia, with the most prestigious universities belonging to Australia’s “Group of Eight.” These universities offer comparatively more-contested places, thereby potentially creating heterogeneity in the value of retaining a place within a university course. These heterogeneity considerations warrant further exploration in relation to the USE.

Applications

The developed scales provide a means for the advancement of literature into the role of embeddedness in student retention. By understanding the mechanisms that anchor students to their university, universities will be better placed to support and retain their students. The current results suggest that the Fit scale may be especially useful for universities to use in identifying students at risk of dropout. Early identification of “at risk” students may lead to the development of targeted interventions aimed at increasing student retention (Cassells, 2017). Providing students with relevant career advice and explicitly stating the link between course content and future desired career roles are likely to bolster student perceptions of Fit. Additionally, developing authentic assessment and providing opportunities for work integrated learning and work placements are also likely to better connect course content with student career aspirations (Jackson, 2016). In establishing student-course fit, it is also important that universities provide prospective students with realistic course previews so that students can make informed enrolment decisions (Buckley et al., 2004). University orientation weeks are particularly important for university staff to establish academic expectations with their students, and to provide students with information to allow them to ascertain whether their skills and interests align with the content of their courses.

Despite no significant relationship between stay intentions and Links, the obtained relationships between Links and academic-related skills and outcomes suggest that bolstering student ties with academics and their peers is likely to be an important aspect of fostering positive student experiences. Links between students can be advanced through encouraging and providing funding and resources to student groups and associations. Links between students and staff, particularly informal interactions, can be fostered through networking and career/industry events. In terms of inter-cultural applicability, we would expect Fit and Links scales to yield similar results outside of the Australian context. Perceived Sacrifices may differ inter-culturally depending on perceived ease of transfer across institutions or based upon student fee contributions (i.e., students may be less likely to view leaving as a Sacrifice if they pay no fees). Hence, future efforts to develop a Sacrifice scale should consider the role of the university context in perceived Sacrifices, and assess the inter-cultural generalizability of the USE, particularly when understanding retention in non-Western higher education institutions. Indeed, due to the homogeneous nature of the samples employed in this study, differential item functioning could not be modeled for cultural/racial differences. It is recommended that further validation ascertain whether the items are invariant to cultural differences.

Conclusions

These studies provide evidence for the validity of the Fit and Links scales of the USE, and suggest that student-university/course Fit may be particularly important to furthering our understanding of the reasons students choose to remain enrolled in their courses and at their universities. This research makes a unique contribution to the student retention literature by providing psychometrically sound tools for the identification of students at risk of leaving their studies. The development of this measure is an important step in applying embeddedness to a student context, and subsequently enabling the student retention literature to capitalize on the vast research into workplace turnover. We encourage further validation of the USE, employing more objective measures of student retention, and validation in contexts beyond Australia to assess the applicability of this scale to other cultural groups and university systems.