Introduction

As contemporary society is built on scientific knowledge, scientists, as a key input for scientific knowledge production, must be developed sustainably (Bozeman et al., 2001; Gu et al., 2018; Laudel & Glaser, 2008; Stephan, 2012; Yoshioka-Kobayashi & Shibayama, 2021). Ph.D. education is critical for training junior scientists (Cyranoski et al., 2011; Gould, 2015; National Research Council, 1998; Stephan, 2012), and considerable efforts have been made to implement effective Ph.D. education systems (Altbach, 2007). The higher education literature has closely examined the operation of Ph.D. education, indicating substantial heterogeneity in training modes and outcomes between organizations and supervisors (Bastalich, 2017; Hockey, 1991; Mainhard et al., 2009; Marsh et al., 2002; Shibayama et al., 2015; Yoshioka-Kobayashi & Shibayama, 2021).

Although previous studies have contributed to our understanding of Ph.D. education, a bottleneck still exists. The quality of students who join Ph.D. programs significantly impacts the outcome of Ph.D. education (Van Ours & Ridder, 2003); therefore, students’ career choices at this early stage (whether to proceed to a Ph.D. or not) are of crucial interest. Although several studies have investigated the conditions for students to pursue a Ph.D., their scope has mostly been limited to socioeconomic factors, such as demographics and family backgrounds (Borrego et al., 2018; Eagan et al., 2013; English & Umbach, 2016; Perna, 2004). This study adds to the literature by investigating factors that more directly influence students’ future scientific performance. How such factors influence students’ decision to pursue a Ph.D. is worthy of attention, since Ph.D. graduates are expected to engage in knowledge production. We first examine students’ individual attributes that contribute to their scientific performance, specifically their abilities and job motives. Second, we investigate the local environment in terms of academic institutions’ two primary functions, education and research. Because of the tension between these two functions, supervisors and labs have different priorities and invest in these functions differently (Leisyte et al., 2009; Shibayama et al., 2015). This results in the heterogeneity of students’ experiences, which impacts their later performance. Hence, we examine both the recipients (students) and the providers (supervisors and labs) of Ph.D. training and analyze how they interactively shape students’ career choices.

This aim poses two empirical challenges. First, local environments vary across programs and supervisors (Hockey, 1991; Shibayama et al., 2015). However, identifying a specific program of interest for each student is difficult, and most previous studies had to overlook the heterogeneity of local environments. Second, students’ individual attributes, particularly abilities, are typically observable only after they begin working on research projects, at which point those who do not pursue Ph.D. degrees are excluded. We address these issues by exploiting our unique empirical design in the context of Japanese graduate education.

Japan is one of the leading producers of science in terms of scientific publications and awarding of Ph.D. degrees, though recent budgetary constraints have negatively affected the country (Shima, 2012, 2017). For our study purposes, the Japanese context provides suitable conditions for the following reasons. In science, technology, engineering, and mathematics (STEM) fields in Japan, a relatively high proportion of undergraduate students pursue master’s degrees, and Ph.D. programs are frequently extensions of master’s programs. Notably, the majority of Ph.D. students are recruited from master’s graduates of the same lab (Kato & Chayama, 2010). As master’s students participate in research activities during their master’s programs, they become acquainted with the local environment of the specific lab. In this context, we investigated pairs of master’s students, who could be considered potential Ph.D. candidates, and their supervisors. We conducted separate questionnaire surveys of the students and their supervisors when the students were about to graduate and had already decided on their career paths (Ph.D. or not). The students were asked about the local environment and their job motives, and the supervisors were asked to evaluate their students’ abilities. The data collected suggested that students’ attributes and the local environment interactively shape the students’ decision to pursue a Ph.D.

This paper is organized as follows. The next section reviews prior studies concerning the determinants of progression to the Ph.D. level. The section that follows describes our empirical context and outlines the data and methods. Then, we present our econometric analysis results. Finally, we present a summary of our findings and discuss their implications.

Literature review

A body of higher education literature has studied the conditions for students enrolling in Ph.D. programs. The career decision to pursue Ph.D. programs can be viewed as a match between a student and the local environment. Accordingly, determinants of Ph.D. progression can be grouped into the attributes of the two sides. In this regard, most previous research focused on the students’ attributes. These studies investigated undergraduate students’ career intentions (their desire or intention to pursue a Ph.D.), focusing on several socioeconomic factors. For example, a few studies found that female students are less likely to pursue Ph.D. degrees (Perna, 2004), and that the propensity for Ph.D. progression varies by ethnicity (Eagan et al., 2013; Perna, 2004). Students’ career choices are also influenced by their family’s educational backgrounds and financial status (Eagan et al., 2013; Kallio, 1995; Perna, 2004).

Student’s attributes

Although these studies contribute to our basic understanding of the determinants of Ph.D. progression, they only provide a partial explanation. In particular, Ph.D. graduates are expected to become scientists and create scientific knowledge; hence, it is of particular interest to examine how their characteristics conducive to scientific performance shape their career choices. Previous research has suggested that individuals’ various skills and personal traits are related to scientific performance. For example, a basic understanding of the discipline is essential (Bozeman et al., 2001; Clark, 1984). Scientific research needs not only theoretical knowledge but also technical skills such as craft skills for experiments (Bozeman et al., 2001). On top of such domain-specific knowledge, domain-unspecific skills, such as creative thinking, are critical (Amabile, 1988; Shibayama & Wang, 2020). Scientists may also need managerial skills since scientific research is frequently team-based and requires coordination among members (Pearson & Brew, 2002).

In this regard, previous studies on Ph.D. progression investigated only generic abilities, such as SAT and GRE scores and grades in undergraduate programs, and found a generally positive impact (Eagan et al., 2013; English & Umbach, 2016; Jung & Lee, 2019; Perna, 2004; Walpole, 2008). A few studies only indirectly investigated abilities related to scientific research. For example, Borrego et al. (2018) suggested that students’ self-efficacy in research activities influences progression to graduate programs in US engineering schools. Although it concerns later career stages after Ph.D. completion, the literature on academic careers corroborates the link between research abilities and academic career choice. Studies found that students’ perceived ability or performance indicators (e.g., number of publications) are positively associated with the decision to stay in academia (Conti & Visentin, 2015; Roach & Sauermann, 2010).

Another individual attribute that is known to be conducive to scientific performance is motive. Ph.D. studies, and scientific research in general, require a consistent commitment to a specific research agenda. Thus, students’ motives must align with the direction of the work. This is especially important in activities that require creativity and autonomy, such as scientific research (Amabile, 1988). In this regard, a few studies looked into students’ job motives as a determinant of Ph.D. progression (Eagan et al., 2013; Jung & Lee, 2019; Walpole, 2008). They found that progression to a Ph.D. can be driven by intrinsic factors, such as interest in intellectual work, preference for an autonomous work environment, and contribution to society, as well as extrinsic factors, such as employment prospects.

Local environment

The factors discussed above are important when students decide whether or not to pursue Ph.D. degrees. However, students also have to make more specific choices regarding the local environment for the right supervisor and the right laboratories (Jung & Lee, 2019). This is due to the fact that Ph.D. education is not uniform and can vary even within the same program (Hockey, 1991; Marsh et al., 2002; Shibayama et al., 2015). Previous studies on Ph.D. progression provided only limited knowledge in this regard because they focused primarily on the first decision of whether or not to pursue a Ph.D. and ignored the heterogeneity of the local environment. The present study considers labs or research teams, which are responsible for students’ research training, as a unit of the local environment, and we highlight their heterogeneity in the two main functions of academic institutions, i.e., education (training) and research. Supervisors and labs have different priorities in education and research due to the tension and trade-off between the two functions (Leisyte et al., 2009; Shibayama et al., 2015), which considerably changes students’ experiences.

Local training environment

In terms of education, the local environment differs in training capacity, that is, the lab’s resource basis on which students’ knowledge and skills are developed. To begin with, a substantial variation exists in the frequency of communication between students and supervisors, and in how much effort supervisors invest in student training (Hockey, 1991; Shibayama & Kobayashi, 2017). A more qualitative heterogeneity is suggested in supervisors’ mentoring styles (Hockey, 1991; Mainhard et al., 2009; Marsh et al., 2002; Shibayama et al., 2015). Mentoring styles can be dissected from various angles, one of which categorizes career development and psychosocial support as two functions of mentoring (Kram, 1985). For students’ career development, supervisors coach students, give them challenging assignments, and expand their networks, whereas for psychosocial support, supervisors develop students’ sense of professional self and provide counseling and role modeling. Ph.D. supervisors give varying weights to these aspects (Paglis et al., 2006; Tenenbaum et al., 2001). Another aspect of Ph.D. training highlighted in the literature is the autonomy (as opposed to control) that students are given (Kam, 1997; Shibayama, 2019; Wichmann-Hansen & Herrmann, 2017). Ph.D. education relies heavily on learning by doing, in which students engage in actual research projects. Because students typically begin with limited skills in conducting scientific research, guidance and control under supervisors are required. However, excessive control can be counterproductive, and granting a high level of discretion may be justified (Heinze et al., 2009; Lee et al., 2007). In fact, a few studies indicated variation in the level of autonomy that students are given in a variety of research tasks (Kam, 1997; Shibayama et al., 2015), implying that an autonomous environment can be beneficial for students’ learning but costly for supervisors (Shibayama, 2019; Wang & Shibayama, 2022).

Observing variations in Ph.D. training, the literature contends that an appropriate mentoring style depends on each student and that no single style is suitable for everyone (Hockey, 1991; Mainhard et al., 2009; Marsh et al., 2002). This implies that appropriately matching students and local environments is crucial, but only a few studies on Ph.D. progression have referred to the local environment (Gatfield, 2005). One exception is Eagan et al. (2013), who argued that faculty mentorship and communication with graduate students are positively associated with students’ willingness to continue in graduate programs. Regarding the stage after Ph.D. progression, Castelló et al. (2017) examined the conditions for students to complete or drop out of a graduate program, suggesting that socialization within the local community increases the likelihood of completion.

Local research environment

The local environment also differs in research capacity, that is, the lab’s resource basis on which research activities are conducted and new knowledge is created. Some labs invest more resources (time, budget, etc.) in research activities, are equipped with expensive devices, and publish more papers than other labs (Stephan, 2012). A local environment with higher research capacities can provide students with several advantages. For example, students are likely to produce more by being part of a productive research team (Carayol & Matt, 2004). This advantage in the initial career stage reinforces future productivity through the cumulative advantage mechanism (Allison et al., 1982; Diprete & Eirich, 2006). Being supervised by reputed supervisors and graduating from prestigious programs also offer greater postgraduate career opportunities (Long et al., 1979; Miller et al., 2005). Thus, students are likely to prefer a local environment that has higher research capacities and a track record of excellent research achievement. Although direct empirical evidence is lacking, a few studies on career choices after Ph.D. completion corroborated this argument, finding that students supervised by high performers are more likely to stay in academia after Ph.D. graduation (Conti & Visentin, 2015; Roach & Sauermann, 2010). Meanwhile, a recent study provided counter-evidence that Ph.D. students may not pursue an academic career when supervised by highly cited supervisors and trained in labs with a strong research network (Broström, 2019).

Matching of students and local environment

We contend that students’ career choices are influenced by their own attributes and the local environment not only independently but also interactively. Concerning local research environments, a positive assortment is plausible, in which students with high abilities are matched with labs with high research capacities. In general, students prefer labs with higher research performance (Maher et al., 2020), and labs prefer students with greater abilities and commitment. Thus, a positive assortment can occur if an efficient selection mechanism (i.e., admission process) exists. However, a different mechanism could be at work. That is, labs with high research capacities may accept more students, not only the best students but also other students. It is in the lab’s interest to recruit many students rather than the brightest students, especially when research activities are labor-intensive (Freeman et al., 2001). In this scenario, low-ability students can be matched with high-capacity labs.

Similar arguments can be made in terms of the local training environment. In general, students should prefer labs with higher training capacities; however, students with different abilities may prioritize training and research capacities differently. In one plausible scenario, high-ability students perceive limited necessity for training and prioritize research capacities, whereas low-ability students perceive a high need for training and thus prioritize training capacities. As a result, low-ability students may be matched with high training-capacity labs. However, the priority can be reversed, for example, when students are under time constraints to produce research output. Low-ability students may be less confident in their performance and feel pressured to publish more, for which they may have to rely on the lab’s research capacities. Meanwhile, high-ability students are confident in their abilities and can thus focus on developing their skills for future growth. This scenario should match high-ability students with high training-capacity labs.

Methods and data

Empirical setting

Provider of graduate education

We used survey data in life and information sciences in Japan to conduct empirical research. Japan has approximately 800 universities, 400 of which offer Ph.D. programs. Based on their governing bodies, universities are divided into three categories: 86 are national, 94 are regional (of prefectures or cities), and 615 are private. National universities are the primary providers of both academic training and research among the three groups, whereas most private universities focus on undergraduate education. National universities enroll 65% of new Ph.D. students (Footnote 1) and produce roughly half of all scientific papers (Shima, 2017). Among them, the seven pre-imperial national universities (Footnote 2) have historically played a significant role (Kneller, 2007), accounting for 36% of Ph.D. students and 43% of academic papers as of 2020 (Footnote 3).

Structure of graduate programs

Most graduate programs in Japanese universities consist of a 2-year master’s program and a 3-year Ph.D. program. During their master’s program, students usually decide whether to pursue a Ph.D. In 2020, 48,000 students received STEM master’s degrees, with 8.6% pursuing Ph.D. studies. Many Ph.D. students continue to work in the same lab under the supervision of the same professor. In 2014, 62% of STEM Ph.D.s were recruited from master’s degree programs at the same university (Kato & Chayama, 2010). Master’s students conduct research during their master’s programs, which is often extended to Ph.D. thesis projects if they pursue Ph.D. degrees.

There is a similar continuity between undergraduate and master’s programs. Typically, undergraduate students decide whether or not to pursue master’s degrees during their undergraduate programs. Undergraduate students can join labs and begin research activities, which can be expanded to master’s and doctoral levels. A relatively high proportion of STEM students continue on to master’s degree programs. In 2020, 185,000 students received STEM bachelor’s degrees, with 23.8% continuing on to master’s programs. Approximately 76% of those master’s students attended the same university where they received their bachelor’s degrees (Hoshino et al., 2021).

Students commonly complete undergraduate and graduate programs without interruption. However, there is another path to a Ph.D.: after earning master’s degrees, some students choose to gain work experience outside of academia and return to Ph.D. programs afterwards. In 2020, 35% of STEM Ph.D. students had previous employment experience, and this proportion has been increasing. In the following analysis, we primarily examine the first path to a Ph.D. but also look at the second path as a supplementary analysis.

Environment around Ph.D.

A Ph.D. degree is typically required for academic employment in Japan, as in many other countries, and Ph.D. programs in Japan are more focused on the development of academic scientists (as opposed to industry scientists). In fact, in 2015, 58% of STEM Ph.D. graduates remained in academia (Matsuzawa et al., 2018).

The financial basis for Ph.D. programs is worth noting. Ph.D. students in Japan do not always receive adequate financial support. According to a national survey in 2018, 55% of Ph.D. students received no financial assistance and only 10% received 1.8 million JPY per year, the amount considered necessary by the government for living expenses (Footnote 4). Though assistance is provided through various channels (e.g., teaching assistant, research assistant), the vast majority is publicly financed while private funding is rather uncommon. Students must also pay tuition, which in national universities is 536,000 JPY per year. Although tuition may be waived in some cases, most Ph.D. students pay the full amount (Kawamura & Hoshino, 2022).

Insufficient financial support for Ph.D. students has been a major source of concern and is viewed as an important reason for students not to pursue a Ph.D. Furthermore, since the 2000s, the academic sector as a whole has faced increasing budgetary constraints, and the condition of academic jobs has been deteriorating (Shima, 2012). This resulted in lower publication productivity (Shima, 2017), although Japan remains a major producer of scientific knowledge. Because academia is the primary employer of Ph.D. graduates (Kawamura & Hoshino, 2022), poor career prospects have made academic careers unappealing to younger generations (Arimoto et al., 2019), and the enrollment rate from master’s to Ph.D. programs has been declining (9.3% in 2018 compared with 15.7% in 1998).

Data

We draw on survey data from pairs of master’s students, who are considered potential Ph.D. candidates, and their supervisors. The continuation between master’s and Ph.D. programs provides a few advantageous conditions for this research. First, because master’s students are the primary source of Ph.D. candidates in the same lab, we can reasonably expect a pairing of a master’s student and a supervisor to continue if the student chooses to pursue a Ph.D. degree. This is an advantage over previous studies, which did not identify such a pair (Eagan et al., 2013; Perna, 2004). Second, the continuation of the two programs allows master’s students to base their career choices on the local environment they encountered during their master’s programs. Third, it enables master’s supervisors to assess their students’ abilities based on their research activities during their master’s programs. Thus, by investigating these pairs, we can address the limitations of previous studies on the determinants of Ph.D. progression (Eagan et al., 2013; Perna, 2004).

Survey design

We conducted questionnaire surveys of student–supervisor pairs in two steps. First, we sent a questionnaire to a random sample of supervisors and inquired into various aspects of the lab environment. The supervisors were then asked to choose up to three students in the second year of the master’s program under their supervision and assess each student’s scientific abilities. Second, the supervisors were asked to send a survey invitation to the chosen students. After accepting the invitation, the students were questioned about their career choice, perception of the lab environment, personal background, and so on. This two-step design is intended to protect the students’ anonymity, which we deemed critical because the survey included sensitive information such as supervisor assessments of student abilities and student evaluations of the lab environment.

We created the questionnaire items based on previous studies (Borrego et al., 2018; Eagan et al., 2013; English & Umbach, 2016; Kallio, 1995; Kam, 1997; Paglis et al., 2006; Perna, 2004; Ro et al., 2017; Shibayama et al., 2015; Tenenbaum et al., 2001) while taking the specific context of Japan into consideration. We also conducted unstructured interviews with 11 current and former graduate students. A pilot survey was conducted to test the developed questionnaire. The finalized survey was distributed in November and December of 2019. The academic year in Japan begins in April and ends in March. Thus, at the time of the survey, student respondents were nearing the end of their master’s program (a few months before graduation), and almost all of them had decided on their career path.

Sample

We randomly selected a sample of supervisors in the following steps. First, we purposively chose 16 research-intensive universities where most faculty members supervise both master’s and Ph.D. students (Footnote 5). Second, we decided to focus on the fields of life and information sciences. We found 138 schools related to these fields at the chosen universities (Footnote 6). In comparison to more academically oriented fields, these fields provide good employment opportunities not only in academia but also in industries for postgraduate students. Thus, we anticipated that master’s students’ career choices were less predetermined and were shaped during their master’s programs. Third, we compiled a list of faculty members at these schools using publicly available information (e.g., school websites). We surveyed full professors and associate professors, but not assistant professors, who are less likely to supervise students. The list includes 2176 professors in life sciences and 850 professors in information sciences. Finally, we randomly chose 1300 professors in life sciences and 700 in information sciences (Footnote 7). We then removed 160 professors from the sample due to retirement and other practical reasons. We received 465 responses (331 in life sciences and 134 in information sciences) from the remaining 1840 professors (response rate = 25.3%; Footnote 8).

For the student survey, we asked each supervisor respondent to select up to three master’s students. If a supervisor had more than three students, we asked them to choose three students in alphabetical order, so that the selection was as random as possible. We also requested that non-Japanese students be excluded in order to reduce the heterogeneity of personal backgrounds that are not of primary interest (Footnote 9). Invitations to the student survey were sent to 644 students (1.4 students per supervisor), and we received responses from 203 students (response rate = 31.5%).
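The sampling arithmetic reported above can be checked with a short sketch. The counts are those stated in the text; the helper name `response_rate` and the variable names are illustrative only and are not part of the original study.

```python
def response_rate(responses: int, invited: int) -> float:
    """Return the response rate as a percentage of those invited."""
    if invited <= 0:
        raise ValueError("invited must be positive")
    return 100.0 * responses / invited


# Supervisor survey: 1300 + 700 professors sampled, 160 removed.
supervisors_invited = 1300 + 700 - 160   # 1840 professors remained
supervisor_responses = 331 + 134         # life + information sciences = 465

# Student survey: invitations forwarded by responding supervisors.
students_invited = 644
student_responses = 203

supervisor_rate = response_rate(supervisor_responses, supervisors_invited)
student_rate = response_rate(student_responses, students_invited)
students_per_supervisor = students_invited / supervisor_responses

print(f"Supervisor response rate: {supervisor_rate:.1f}%")        # 25.3%
print(f"Student response rate: {student_rate:.1f}%")              # 31.5%
print(f"Students per supervisor: {students_per_supervisor:.1f}")  # 1.4
```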

Measures

Career choice

We prepared two variables related to students’ career choices. The main career variable is the choice made immediately following master’s programs. A dummy variable is coded 1 if a student plans to pursue a Ph.D. program immediately after completing his or her master’s degree, and 0 otherwise (PhD after MS). Another career variable is the students’ intention to return to Ph.D. programs in case they did not plan to pursue a Ph.D. immediately after master’s programs. A dummy variable is coded 1 if a student expressed interest in such a career path, and 0 otherwise (PhD after employment).

Student’s abilities

We assessed students’ abilities from various perspectives. First, we asked the supervisors to rate each student’s overall research ability on a four-point scale: (3) within the top 10%, (2) top 25%, (1) top 50%, and (0) bottom 50%, among all students they had previously supervised (overall ability). The supervisors then rated each student in terms of (a) base academic ability, (b) technical skills, (c) logical thinking, (d) originality, and (e) managerial skills in order to break down abilities into different aspects. Each student was evaluated on these dimensions on a dichotomous scale (having each skill or not). Base academic ability is expected to capture students’ generic abilities. Technical skills are more domain-specific skills for experiments, programming, and so on (Bozeman et al., 2001; Clark, 1984). We include logical thinking and originality as domain-unspecific skills conducive to creativity (Amabile, 1988; Shibayama & Wang, 2020). Finally, managerial skills refer to potential abilities to supervise a research team (Pearson & Brew, 2002).
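As an illustration of how such an assessment might be coded for analysis, the sketch below maps the four-point overall rating and the five dichotomous dimensions to numeric variables. The category labels follow the text; the function name, the dictionary layout, and the example student are hypothetical.

```python
# Four-point overall rating, coded as described in the text.
OVERALL_ABILITY_CODES = {
    "top 10%": 3,
    "top 25%": 2,
    "top 50%": 1,
    "bottom 50%": 0,
}

# Five ability dimensions, each rated on a dichotomous scale.
ABILITY_DIMENSIONS = (
    "base academic ability",
    "technical skills",
    "logical thinking",
    "originality",
    "managerial skills",
)


def code_assessment(overall_label: str, skills_present: set[str]) -> dict[str, int]:
    """Turn one supervisor assessment into numeric variables.

    `skills_present` holds the dimensions the supervisor marked as present.
    """
    unknown = skills_present - set(ABILITY_DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown ability dimensions: {sorted(unknown)}")
    record = {"overall ability": OVERALL_ABILITY_CODES[overall_label]}
    for dimension in ABILITY_DIMENSIONS:
        record[dimension] = int(dimension in skills_present)
    return record


# Hypothetical student rated in the top 25%, with two skills marked present.
example = code_assessment("top 25%", {"originality", "technical skills"})
print(example)
```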

Job motives

Based on previous literature (Eagan et al., 2013; Mcculloch et al., 2017; Paulsen & Toutkoushian, 2008; Walpole, 2008) and our interviews, we developed 12 questionnaire items on social contribution, work responsibility, intellectual stimulation, financial returns, stable employment, and so on (Table 5). Students responded whether each item is important in their career choices or not on a dichotomous scale. We then used a factor analysis to extract four factors corresponding to (a) intellectual stimulation, (b) social contribution, (c) financial gain, and (d) work–life balance. These factors have eigenvalues greater than one and account for 54% of the total variance.
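The retention rule mentioned here (eigenvalues greater than one) and the share of variance explained can be sketched as follows. The eigenvalues are hypothetical and were chosen only so that four factors are retained and 54% of the variance is explained for 12 items. The sketch also assumes that the eigenvalues come from the item correlation matrix, so that they sum to the number of items; the text does not report the extraction method.

```python
def retained_factors(eigenvalues: list[float], threshold: float = 1.0) -> list[float]:
    """Keep the factors whose eigenvalue exceeds the threshold."""
    return [value for value in eigenvalues if value > threshold]


def variance_explained(eigenvalues: list[float], threshold: float = 1.0) -> float:
    """Share of total variance captured by the retained factors.

    Assumes eigenvalues of a correlation matrix, whose sum equals
    the number of items.
    """
    total = sum(eigenvalues)
    return sum(retained_factors(eigenvalues, threshold)) / total


# Hypothetical eigenvalues for 12 items (illustrative, not the study's values).
hypothetical_eigenvalues = [
    2.40, 1.70, 1.30, 1.08, 0.95, 0.90,
    0.85, 0.80, 0.75, 0.60, 0.40, 0.27,
]

n_factors = len(retained_factors(hypothetical_eigenvalues))
share = variance_explained(hypothetical_eigenvalues)
print(f"{n_factors} factors retained, explaining {share:.0%} of the variance")
```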

Local training environment

We prepared several measures for the local training environment. First, we asked students to evaluate their supervisors’ mentoring styles. Based on previous surveys (Paglis et al., 2006; Tenenbaum et al., 2001), we developed six items concerning psychosocial support and career development (Table 6). Students responded whether each item applies to their supervisors on a dichotomous scale. We used a factor analysis to extract two factors corresponding to psychosocial support and career development (Kram, 1985). The two factors have eigenvalues greater than one and account for 49% of the total variance. We also assessed the level of autonomy that students had in different research tasks. Following existing survey instruments (Kam, 1997; Shibayama et al., 2015), we asked the students whether they had substantial responsibility in each task: (a) setting a research topic, (b) formulating a hypothesis, (c) planning research methods, (d) monitoring progress, (e) reviewing prior studies, and (f) writing a paper. Students responded on a dichotomous scale, and we calculated the ratio of tasks for which the respondent had high responsibility (autonomy). We also asked the students how often they received supervision from their supervisors (Shibayama & Kobayashi, 2017), and a dummy variable is coded 1 if supervision occurred at least once a week and 0 otherwise (Frequent supervision).
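The autonomy ratio and the supervision dummy described above can be sketched as follows. The six task names follow the text. The function names, the example responses, and the representation of supervision frequency as a number of sessions per week are assumptions made for illustration, since the original response format is not reported.

```python
RESEARCH_TASKS = (
    "setting a research topic",
    "formulating a hypothesis",
    "planning research methods",
    "monitoring progress",
    "reviewing prior studies",
    "writing a paper",
)


def autonomy(responses: dict[str, bool]) -> float:
    """Ratio of the six tasks for which the student had substantial responsibility."""
    missing = [task for task in RESEARCH_TASKS if task not in responses]
    if missing:
        raise ValueError(f"missing responses for: {missing}")
    return sum(responses[task] for task in RESEARCH_TASKS) / len(RESEARCH_TASKS)


def frequent_supervision(sessions_per_week: float) -> int:
    """Dummy coded 1 if supervision occurred at least once a week, 0 otherwise."""
    return 1 if sessions_per_week >= 1 else 0


# Hypothetical student with substantial responsibility in three of six tasks.
example_responses = {
    "setting a research topic": False,
    "formulating a hypothesis": True,
    "planning research methods": True,
    "monitoring progress": False,
    "reviewing prior studies": True,
    "writing a paper": False,
}

print(autonomy(example_responses))   # 0.5
print(frequent_supervision(0.5))     # 0 (roughly once every two weeks)
print(frequent_supervision(2))       # 1
```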

Local research environment

We prepared three measures for the local research environment (Stephan, 2012). First, we created a variable for the output of research activities by counting the number of papers authored by the supervisor and taking its logarithm (ln(Supervisor #pub)). Second, for the input of research (Toutkoushian & Bellas, 1999), the supervisor survey asked how many hours were invested in research-related activities per week on a seven-point scale: (1) less than 10 h, (2) 10–20 h, (3) 20–30 h, (4) 30–40 h, (5) 40–50 h, (6) 50–60 h, and (7) more than 60 h (Supervisor research time). Third, as a proxy for scientific reputation in the Japanese context (Footnote 10), the survey asked whether the lab had been involved in international collaborations in the previous 3 years, and a dummy variable is coded 1 if it had been and 0 if it had not (international collaboration).
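A minimal sketch of how these three measures could be constructed is given below. Two points are assumptions rather than details from the text: hours falling exactly on a category boundary (e.g., 10 h) are assigned to the higher category, and a supervisor is assumed to have at least one paper, because the treatment of zero counts before taking the logarithm is not described. All function names are illustrative.

```python
import math


def ln_supervisor_pub(n_papers: int) -> float:
    """Natural logarithm of the supervisor's paper count.

    Assumes at least one paper; how zero counts were handled is not
    stated in the text.
    """
    if n_papers < 1:
        raise ValueError("logarithm is undefined for a zero paper count")
    return math.log(n_papers)


def supervisor_research_time(hours_per_week: float) -> int:
    """Seven-point scale: (1) less than 10 h ... (7) more than 60 h.

    Boundary values are assigned to the higher category (assumption).
    """
    if hours_per_week < 0:
        raise ValueError("hours must be non-negative")
    return min(7, int(hours_per_week // 10) + 1)


def international_collaboration(collaborated_last_3_years: bool) -> int:
    """Dummy coded 1 if the lab collaborated internationally in the past 3 years."""
    return int(collaborated_last_3_years)


print(supervisor_research_time(5))    # 1
print(supervisor_research_time(45))   # 5
print(supervisor_research_time(75))   # 7
print(round(ln_supervisor_pub(100), 2))
print(international_collaboration(True))  # 1
```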

Control variables

We controlled for family backgrounds following previous studies (Eagan et al., 2013; English & Umbach, 2016; Perna, 2004). If at least one parent had a postgraduate degree, a dummy variable is coded 1, and 0 otherwise (Parent postgrad). Another dummy variable is coded 1 if at least one parent had been employed in R&D jobs in industry or in research and education in academia (Parent R&D job). We also controlled for other institutional and personal factors. First, we measured the lab size by the number of faculty members (#Staff) and graduate students (#Student). Second, students were asked whether their master’s thesis was based on applied or basic research on a five-point scale from (1) mostly basic to (5) mostly applied (Applied research). This is because applied orientation is associated with industrial employment (Agarwal & Ohyama, 2013). Third, we controlled for how many years the supervisor has been in an academic career (Supervisor tenure), mainly because older supervisors (who are about to retire) are unlikely to accept Ph.D. students. Fourth, because career options differ across disciplines, a dummy variable is coded 1 for life science labs and 0 for information-science labs (Life science). Fifth, because Ph.D. progression is more likely at higher-ranked universities, a dummy variable is coded 1 for the seven top-tier universities and 0 for the rest (Top-tier univ; Footnote 11). Finally, we controlled for the student’s gender by assigning a dummy variable of 1 to female students and 0 to male students (Female).

Table 1 displays the descriptive statistics and correlation matrix for the variables.

Table 1

Results

Determinants of Ph.D. progression

First, we investigate the determinants of Ph.D. progression immediately following master’s programs. We find that 16% of students planned to pursue Ph.D. programs while 80% were employed (mostly in the private sector) and 3% had not decided on their careers. Table 2 shows the results of probit regressions predicting the likelihood of students enrolling in Ph.D. programs (PhD after MS). Because we are particularly interested in students’ abilities, we ran two models: model 1 with the overall ability and model 2 with the breakdown of ability measures. As the two models indicate similar results, we primarily explain model 2 unless otherwise stated.
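For reference, a probit model of this kind is conventionally written as follows. The notation is generic and assumed here for illustration; it is not taken from the paper.

```latex
\Pr(y_i = 1 \mid \mathbf{x}_i) = \Phi\!\left(\mathbf{x}_i^{\top}\boldsymbol{\beta}\right)
```

Here $y_i = 1$ if student $i$ plans to enroll in a Ph.D. program directly after the master's program (PhD after MS), $\mathbf{x}_i$ collects the independent and control variables, $\boldsymbol{\beta}$ is the coefficient vector reported as $b$ in the text, and $\Phi$ is the standard normal cumulative distribution function. Because $\Phi$ is nonlinear, a coefficient indicates the direction of an association but not directly the change in probability.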

Table 2 Prediction of progression to PhD after MS

Student’s abilities

We first examine the direct relationship between students’ abilities and their career choices. Figure 1A shows that high-ability students are more likely to pursue a Ph.D. (p < 0.01). This is in line with previous studies (Eagan et al., 2013; English & Umbach, 2016; Perna, 2004). Model 1 exhibits a coefficient of overall ability that is consistent in sign but statistically insignificant (b = 0.172, p > 0.1). Model 2 breaks down the overall ability into several dimensions, revealing a weakly negative coefficient for base academic ability (b = −0.650, p < 0.1), a positive coefficient for technical skills (b = 0.749, p < 0.05), and a strongly positive coefficient for originality (b = 1.103, p < 0.01). Because originality is regarded as a critical trait for scientists (Hagstrom, 1974; Merton, 1973), the strong preference of high-originality students for academic careers is consistent with job requirements. Technical skills have a positive effect, possibly because they are required for research activities in the selected disciplines (life and information sciences). A negative effect for base academic ability is somewhat unexpected (see Footnote 12), possibly due to the decline in popularity of academic careers in Japan.

Fig. 1 Progression to PhD

Job motives

Of the four job motive variables, only the motive for intellectual stimulation has a significantly positive coefficient (b = 0.388, p < 0.05), which is consistent with previous studies (Eagan et al., 2013; Walpole, 2008). The motive for financial gain has a negative coefficient (b = −0.329, p < 0.05), which could be attributed to the limited financial support for Ph.D. students and the high uncertainty of academic careers in Japan (Hoshino et al., 2021). We also find that the motive for social contribution has a negative effect (b = −0.555, p < 0.01). Thus, students appear to regard the skills trained in Ph.D. programs and academic jobs as contributing insufficiently to society.

Local environment

All three variables of the local research environment are positively associated with Ph.D. progression: supervisor #pub (b = 1.069, p < 0.001), international collaboration (b = 0.734, p < 0.05), and supervisor research time (b = 0.306, p < 0.01). We also find largely positive effects of the local training environment. Psychosocial support (model 1: b = 0.278, p < 0.1) and career development (b = 0.345, p < 0.05) both have positive coefficients. In terms of resources for training, frequent supervision is positively associated with Ph.D. progression (b = 1.046, p < 0.05). These findings are not surprising in that students should prefer to learn in labs with higher capacities for research and training.

Interaction of student abilities and local environment

To further investigate how students’ career decisions are influenced by their abilities, we ran regressions with the same set of independent variables interacted with students’ ability measures. To facilitate interpretation, we divided the sample into high-ability and low-ability student groups, and we illustrated the marginal effects of focal independent variables for each group (Fig. 2). In this analysis, we use three ability variables: (A) overall ability, (B) base academic ability, and (C) originality. Overall ability is expected to most accurately capture the ability relevant to research activities. While base academic ability may capture generic abilities, originality is especially important in scientific research.

Fig. 2 Marginal effect of local environment: breakdown by student’s abilities. Note. Two-tailed test. ***p < 0.001, **p < 0.01, *p < 0.05, p < 0.1. The statistical significance of each local environment variable is presented to the right of each bar, and the statistical significance of the difference between low and high ability groups (blue and orange bars) is presented further to the right. For this analysis, we ran regressions with the same set of variables as in Table 2, except that we added an interaction term between an ability variable and a selected local environment variable. To avoid multicollinearity, we tested one interaction term at a time (we did not include interaction terms for all independent variables simultaneously). After estimating the model, we computed the marginal effect of the selected independent variable for low and high ability groups, respectively. When a selected independent variable (X) is a dummy variable, the marginal effect is computed as Prob.(PhD after MS = 1 | X = 1) − Prob.(PhD after MS = 1 | X = 0). To simplify the analysis, we dichotomized overall ability (A): low: overall ability = 0 or 1 and high: overall ability = 2 or 3.
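The marginal effect of a dummy variable described in the note can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the authors' procedure: the coefficient names and values are hypothetical, and averaging the effect over the students in each group is one common convention that the note does not confirm.

```python
from statistics import NormalDist, mean

PHI = NormalDist().cdf  # standard normal CDF, the probit link


def dichotomize_overall_ability(score: int) -> int:
    """Low (0) for overall ability 0 or 1; high (1) for 2 or 3."""
    if score not in (0, 1, 2, 3):
        raise ValueError("overall ability is expected to range from 0 to 3")
    return 1 if score >= 2 else 0


def dummy_marginal_effect(coef, other_index, high_ability):
    """Prob(y = 1 | X = 1) - Prob(y = 1 | X = 0) for one ability group.

    coef: hypothetical probit coefficients with keys 'const', 'x',
        'ability', and 'x_by_ability' (the interaction term).
    other_index: one value per student in the group, holding the summed
        contribution of all remaining covariates to the linear index.
    high_ability: 0 for the low-ability group, 1 for the high-ability group.
    Assumption: the effect is averaged over the students in the group.
    """
    effects = []
    for rest in other_index:
        base = coef["const"] + coef["ability"] * high_ability + rest
        with_x = base + coef["x"] + coef["x_by_ability"] * high_ability
        effects.append(PHI(with_x) - PHI(base))
    return mean(effects)


# Made-up coefficients for illustration only; not estimates from the paper.
example_coef = {"const": -1.0, "x": 0.8, "ability": 0.3, "x_by_ability": -0.5}
low_effect = dummy_marginal_effect(example_coef, [0.0], high_ability=0)
high_effect = dummy_marginal_effect(example_coef, [0.0], high_ability=1)
```

With a negative interaction coefficient, as in this made-up example, the effect of X is larger in the low-ability group than in the high-ability group, which is the kind of contrast the blue and orange bars display.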

In terms of the local research environment, the three panels of Fig. 2 show a similar pattern. The effects of publication performance, international collaboration, and supervisor’s research time are stronger in the low-ability group than in the high-ability group. Thus, low-ability students are admitted to Ph.D. programs if the local environment demonstrates high research capacities. High-ability students are also influenced by the local research environment, but the magnitude of the effect seems smaller than that for low-ability students.

Figure 2 also shows a consistent pattern across panels for the local training environment, but in the opposite direction. The effect of the local training environment appears to be stronger for high-ability students than for low-ability students. High-ability students plan to pursue a Ph.D. if the local environment has high training capacities, which contrasts with the finding that low-ability students are influenced more by the local research environment. One possible explanation is that low-ability students prioritize short-term research performance based on the lab’s research capacities, whereas high-ability students prioritize learning for long-term performance.

Interaction of job motives and local environment

Similarly, we investigate how students’ motives influence their career choices by interacting the same set of independent variables with job motive measures. We split the sample into high-motive and low-motive groups and present the marginal effects of focal independent variables for each group (Fig. 3). Specifically, we use motives for (A) intellectual stimulation, (B) social contribution, and (C) financial gain as they have a significant impact on students’ career choice (Table 2).

Fig. 3 Marginal effect of local environment: breakdown by motives. Note. Two-tailed test. ***p < 0.001, **p < 0.01, *p < 0.05, p < 0.1. The statistical significance of each local environment variable is presented to the right of each bar, and the statistical significance of the difference between low and high motive groups (blue and orange bars) is presented further to the right. For this analysis, we ran regressions with the same set of variables as in Table 2, except that we added an interaction term between a motive variable and a selected local environment variable. To simplify the analysis, we dichotomized motive variables (lower or higher than the mean). To avoid multicollinearity, we tested one interaction term at a time (we did not include interaction terms for all independent variables simultaneously). After estimating the model, we computed the marginal effect of the selected independent variable for low and high motive groups, respectively. When a selected independent variable (X) is a dummy variable, the marginal effect is computed as Prob.(PhD after MS = 1 | X = 1) − Prob.(PhD after MS = 1 | X = 0).

Figure 3A suggests that students with a strong motive for intellectual stimulation are more likely to pursue a Ph.D. if the local environment demonstrates high training capacities, whereas students with a weak motive are more likely to pursue a Ph.D. if the local environment demonstrates high research capacities. Thus, students motivated by intellectual stimulation seem to prioritize training input but are not influenced by research capacities per se. This pattern resembles the observed contrast between high-ability and low-ability students. Figure 3B shows a similar pattern for the motive for social contribution, though the difference between high-motive and low-motive students is less clear. Finally, Fig. 3C depicts the opposite pattern. That is, students motivated by financial gain are more likely to pursue a Ph.D. in a local environment with high research capacities, whereas those who are not interested in financial gain are more likely to pursue a Ph.D. in a local environment with high training capacities.

Determinants of Ph.D. progression after employment

Of the students who chose not to directly pursue Ph.D. degrees, we find that 51% are interested in returning to Ph.D. programs after working elsewhere (mostly in the private sector). Using probit regressions, we examine the determinants of PhD after employment (Table 3). Model 1 includes the same set of independent variables as in Table 2. However, the career choice in this analysis is conditioned on the first choice of not proceeding to Ph.D. programs immediately after master’s programs. Thus, we additionally draw on the Heckman selection model to incorporate these two stages of career choices (model 2). The outcome equation (the left column) is largely consistent with the simple probit model (model 1), whereas the selection equation (the right column) is consistent with the probit model in Table 2 (opposite signs). We also used the ability breakdown measures to run the selection model (model 3).
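The two-stage structure can be written in the textbook form of a probit model with sample selection. This is an assumed specification for illustration: the text does not state the exact variant of the Heckman model used or which variables enter each equation, and the notation below is generic.

```latex
\begin{aligned}
s_i &= \mathbf{1}\!\left[\mathbf{z}_i^{\top}\boldsymbol{\gamma} + u_i > 0\right]
  && \text{(selection: not proceeding directly to a Ph.D.)} \\
y_i &= \mathbf{1}\!\left[\mathbf{x}_i^{\top}\boldsymbol{\beta} + \varepsilon_i > 0\right]
  && \text{(outcome: PhD after employment, observed only if } s_i = 1\text{)} \\
\begin{pmatrix} u_i \\ \varepsilon_i \end{pmatrix}
  &\sim \mathcal{N}\!\left(\mathbf{0},
  \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}\right)
\end{aligned}
```

Because $s_i$ equals one minus PhD after MS, the selection coefficients $\boldsymbol{\gamma}$ are expected to carry signs opposite to those in Table 2. A nonzero correlation $\rho$ between the two error terms is what would bias a simple probit fitted only to students who did not proceed directly.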

Table 3 Prediction of progression to PhD after employment

In terms of students’ abilities, model 2 shows a significant coefficient of overall ability (b = 0.498, p < 0.001), implying that high-ability students are more interested in returning to Ph.D. programs, even if they decided not to proceed to Ph.D. programs immediately. Model 3 breaks down ability measures, indicating a weakly significant coefficient for originality (b = 0.664, p < 0.1). Thus, students with originality are more likely to pursue Ph.D. degrees both immediately after master’s programs and after employment.

As for job motives, model 3 shows that intellectual stimulation has a positive effect (b = 0.565, p < 0.001) and that social contribution has a negative effect (b = −0.234, p < 0.1). This is consistent with the findings for Ph.D. progression after master’s programs (Table 2). The negative effect of social contribution is interesting because Ph.D. progression after employment is supposed to contribute to linking society and academia.

In terms of the local environment, model 3 shows that publication performance has a positive effect (b = 0.433, p < 0.05), suggesting that students from productive labs are more interested in returning to Ph.D. programs. The other local environment variables become insignificant, possibly because students can return to Ph.D. programs in different labs, which makes the local environment of the master’s program less relevant.

Robustness check—timing of career choice

Some students make career decisions even before joining the lab. Figure 1B illustrates students’ career plans during their first–second years of undergraduate programs and their actual career choices. While 31% of the students did not have a clear career plan during their undergraduate years, the remainder had either a positive or negative opinion about pursuing a Ph.D.

In fact, 28% of the students thought a Ph.D. was “undesirable” during their undergraduate programs and did not pursue it, while 2% thought of Ph.D. progression as “the best option” and did pursue it. These students may have made career decisions without being influenced by the lab’s local environment, and it is possible that these students chose labs based on their career intentions, which could bias our analysis. To clarify the causal link, therefore, we ran additional regression analyses after excluding those who made a career choice during their undergraduate years and followed it, finding largely consistent results (Table 4).
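The exclusion rule for this sub-sample can be sketched as a simple filter. The record layout and the plan labels are hypothetical; only the two cases named in the text (an early "undesirable" view followed by not pursuing a Ph.D., and an early "best option" view followed by pursuing one) are treated as early decisions that were followed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Student:
    early_plan: str      # hypothetical labels, e.g. "best option", "undesirable", "undecided"
    phd_after_ms: bool   # True if the student proceeds to a Ph.D. after the master's program


def decided_early_and_followed(student: Student) -> bool:
    """True if the undergraduate-stage plan matches the actual choice."""
    if student.early_plan == "best option" and student.phd_after_ms:
        return True
    if student.early_plan == "undesirable" and not student.phd_after_ms:
        return True
    return False


def robustness_subsample(students: List[Student]) -> List[Student]:
    """Keep only students whose choice may have been shaped by the lab."""
    return [s for s in students if not decided_early_and_followed(s)]
```

Students who held an early view but acted against it remain in the sub-sample, since their final choice may still reflect the lab's local environment.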

Table 4 Sub-sample analysis excluding students with early career decision

Discussions and conclusions

Ph.D. training is an important mechanism for developing scientists who contribute to scientific knowledge production, and implementing an effective Ph.D. education system has been of scholarly and policy interest (Cyranoski et al., 2011; Gould, 2015; NRC, 1998; Stephan, 2012; Yoshioka-Kobayashi & Shibayama, 2021). As the outcome of PhD education is affected by the quality of students who proceed to Ph.D., this study investigates how students’ career choices at this initial stage are shaped interactively by students’ attributes and the local environment of labs, drawing on the unique context of the Japanese graduate education system.

Our survey data on master’s students and their supervisors suggest several findings. First, we observe that students’ abilities influence their career choices. Students with high originality and technical skills are more likely to progress to a Ph.D., whereas those with high base academic abilities are less likely. Unlike previous studies (Eagan et al., 2013; Perna, 2004), this study suggests that different types of abilities have varying effects on students’ career choices. The findings also show that students’ job motives influence their career choices. The negative effects of motives for social contribution and financial gain may be unique to the Japanese context. Second, the results show significant effects of local environments. Students are more likely to pursue Ph.D. degrees in local environments with high research and training capacities. Third, and most importantly, the impact of the local environment varies between high-ability and low-ability students, as well as between high-motive and low-motive students. The decision to proceed to a Ph.D. is shaped interactively by students’ attributes and local environments. Fourth, we examine Ph.D. progression after employment and find a similar set of determinants.

Our findings offer a few implications for the practice of higher education. The positive association between originality and Ph.D. progression suggests that students with preferable skillsets are indeed selected for the academic sector. However, the fact that students with higher base academic abilities are less likely to pursue a Ph.D. is a cause for concern. In the Japanese context, academic jobs are losing popularity due to uncertain career prospects (Arimoto et al., 2019), which may drive away talented students. The findings also suggest that improving the local environment is an effective way to attract students. This can be accomplished by investing in the training environment (e.g., more supervision time) or by strengthening research capacities (e.g., more publications). Importantly, students with different attributes are influenced by their local environments differently. While students with high abilities are attracted to local environments with high training capacities, students with low abilities are drawn to local environments with high research capacities. Similarly, students seeking intellectual stimulation are drawn to labs with high training capacities. Therefore, to recruit high-ability and high-motive students to Ph.D. programs, the lab’s training capacities must be reinforced.

Our findings may be subject to the specificity of the empirical context, since graduate education systems and academic career systems differ across countries and scientific disciplines. For example, students’ immobility between Ph.D. programs and master’s programs is observed in some Asian and European countries (Gould, 2015) but may be uncommon in other countries. In the latter case, our results may not apply because students cannot experience the local environment and supervisors cannot directly assess students’ attributes. Furthermore, the long-standing poor career prospects of Japanese academics are noteworthy, which probably influenced our results regarding motives. For example, students may respond to the local environment more strongly due to weak pecuniary incentives. Nonetheless, stricter budget constraints for higher education and challenging career opportunities are becoming more common in other countries, including the USA and some European countries (OECD, 2021; White, 2021). In these contexts, a similar mechanism may become more common. Finally, although direct progression to Ph.D. from master’s programs is common in Japan, another route may be more important in other contexts. We examined Ph.D. progression after employment based on students’ future plans, but a more robust analysis based on actual career choices is required.