1 Introduction

Every goal has subgoals—tasks that must be completed in order to achieve the goal. In undergraduate education, a critical goal for the student is degree completion. The student and their advisor track progress toward that goal by ensuring that required and elective courses are completed. In graduate education, degree progress requires course completion but also achieving various milestones like filing a program of study, passing a preliminary exam, and proposing and defending a thesis. Successful programs ensure that students move through these milestones in a timely manner, and tools to track and assist this process are thus welcome additions. Excellent graduate educators go beyond the mere tracking of individuals by (a) identifying markers of success that drive educational practices, (b) ascertaining barriers (e.g., advisors, departmental policy, lack of funding) that slow the pace or likelihood of milestone completion, and (c) intervening to maximize the timely completion of the degree. To enable these interventions, we created a milestones dashboard to support department and college leaders in their quest for excellence in graduate education.

Prior research on the completion of degrees at the undergraduate level has focused on student preparation to complete the required coursework [1,2,3]. Much of the focus has been on STEM (science, technology, engineering, and mathematics) fields where degree completion rates are lower and the need for graduates is higher. Low grades in specific courses required for the degree (e.g., chemistry, calculus) and low overall grade point averages are critical factors in predicting degree completion or the transfer to a new major.

Graduate student success, however, is much less driven by course grades. The more selective admission process in many Master's and doctoral programs eliminates the lower tail of the GPA distribution, thus ensuring that the vast majority of graduate students have the aptitude to receive good course grades in graduate school. Indeed, a comprehensive survey of doctoral students at major research universities revealed that 80% agreed or strongly agreed that they could complete their graduate program with no trouble [4]. Not completing a graduate program is more likely to be driven by failing to achieve specific milestones (e.g., passing qualifying or preliminary exams, completing a thesis), financial challenges, and poor relationships with a thesis advisor, among other factors [5,6,7]. Furthermore, graduate students are less likely than undergraduates to simply switch majors if they are not thriving, given the limited overlap in curriculum across graduate programs; instead, they will simply stop pursuing a graduate degree. Hence, milestone completion, not merely good grades, is critical to finishing a graduate degree. This is particularly true for the thesis-option Master's degree and the doctoral degree because their additional milestones create more opportunities to stumble.

Many graduate programs are relatively unstructured, and ensuring that program requirements are met often falls on the shoulders of the graduate student and their thesis or dissertation advisor. Unfortunately, many of these thesis and dissertation advisors have incomplete knowledge regarding when milestones should be met, which courses are required, and what restrictions exist regarding the courses that can be applied to Master’s versus PhD programs. This outcome is in large part driven by advisors focusing on the students’ scholarly productivity rather than their coursework and milestones. The inattention to requirements creates delays in many students’ timely completion of their graduate degrees, and these delays can create bitterness toward an advisor, program director, department leader, and the university. As a consequence, improving the experiences of graduate students by ensuring the timely completion of their degrees should play a central role in their education [8]. The decentralized nature of graduate education is a barrier to achieving this goal because most solutions must be implemented at the advisor or departmental level [9, 10].

The advisor-advisee relationship is especially critical to the success of doctoral students, both positively and negatively [6, 7, 11,12,13,14]. The advisor can help a student stay on track, develop their research agenda, provide one-on-one training, secure financial resources, facilitate relationships with other scholars, and provide career guidance, inter alia. Conversely, an advisor can not only fail to provide those benefits but can also undermine degree progress and career success by keeping a student longer to facilitate the advisor's research goals, undermining relationships with other scholars, and providing poor training and bad advice.

The goal of the present project was to create a decision aid that provides information to those overseeing graduate student progress toward their degree. Given that attrition in graduate programs can be quite high, with doctoral completion rates often below 60% (although these data vary considerably across institutions, e.g., [10, 15, 16]), a better understanding of the local factors impacting progress should drive the necessary changes. These changes can involve the admissions process, curriculum structure, program requirements, advising, financial support, and academic support. Although there are published studies that document the impact of various factors on time to degree and graduation, graduate education is highly decentralized in most American universities. This decentralization results in local factors predominating—Western University's Psychology Department may be quite different from other psychology departments, and the local contextual variables may be driving or limiting its success. A single influential person in a department—the chair, graduate program director, research methods instructor, or an outspoken attendee of seminars—can have an outsized impact on student retention and progress. Thus, a decision aid that makes local conditions transparent can help administrators and advisors identify areas of concern or target programs worthy of praise.

2 Tracking progress: benchmarks versus milestones

Most graduate programs provide a clear layout of program milestones including the completion of a program of study, a first-year research project, proposing a thesis, or passing a preliminary or qualifying exam. However, some students and advisors are not aware of one or more milestones or their accompanying paperwork requirements, resulting in a missed milestone or an incomplete required form and creating a moment of crisis when the oversight is eventually discovered. This lack of awareness, coupled with the importance of hitting each milestone in a timely manner, can result in students failing to progress at an acceptable rate.

To address this failure, it is helpful to provide benchmarks—expectations of when milestones should be completed. These benchmarks can be external to the department (e.g., the Graduate School recommends a time frame for filing a program of study or completing a Master's thesis) or internal to the department (e.g., a timetable indicating when each milestone should be met to remain in good standing). However, many departments have little idea how their students are progressing relative to college- or university-wide norms because these data are not readily accessible. Although some institutions provide data on progress to the final milestone—years to graduation and graduation rate—these data are often unavailable, are not broken out by program or demographics, or lack contextualization (e.g., part- vs. full-time, funded vs. self-pay, domestic vs. international).

Prior research has demonstrated that achieving each milestone depends on student and department characteristics. Ampaw and Jaeger [17] provided evidence that the completion of the first year of a doctoral program is positively related to stronger academic ability, good performance in the first-year classes, being funded on an assistantship, being a full-time student, and, surprisingly, not entering with a Master's degree. As students moved beyond this course-focused initial stage, new factors became prevalent. Having a graduate research assistantship (GRA) became more predictive of achieving candidacy than having a graduate teaching assistantship (GTA), being a part-time student was positively related to candidacy, but being a minority or having a previous Master's degree tended to predict a lower likelihood of candidacy. Ampaw and Jaeger speculate that the emergence of minority status as a factor in this middle stage of degree progress (achieving doctoral candidacy) may be related to issues with social integration that lead to isolation during this critical developmental stage. In the final stretch of completing a doctoral dissertation, part-time students again were more likely to finish, as were those funded by GRAs (but not GTAs). Across all three of these milestones, international students were more likely than domestic students to complete each milestone, a result consistent with published studies on time to degree [16, 18,19,20]. In our own data, international students completed their doctoral degrees 1.2 years faster than domestic students, but these differences varied considerably across colleges; the largest differences occurred for doctoral students in Education (2.2 years faster) and Arts and Sciences (1.5 years faster), with even larger differences at the department level. Again, local issues prevailed.

Various factors are likely to affect the achievement of any one of the milestones and will thus contribute to attrition during the progress to a degree. A mismatch between a student's ability, interest, and understanding of their chosen field of study and the reality of work in that discipline often results in a student parting ways with their program [5]. This mismatch could become apparent early during first-year coursework or later when scholarly work outside of the classroom dominates a student's time. Additionally, perceptions of a poor job market for those with Ph.D.s can undermine the motivation to complete the degree, a realization that can occur at any time during the program [5].

3 Building and using a decision aid

Data dashboards constitute a decision aid or decision support system—tools that facilitate individual or group decision making [21, 22]. Decision aids can provide information in a graphic or textual form that will facilitate judgment and the decisions based on those judgments, or they may go even further and make specific recommendations. For example, a decision aid could provide a graphical matrix of course information to assist students in choosing courses, or it could recommend specific courses based on pre-established factors and the weighting that each factor was assigned by the decision maker.

The advantages of decision aids include increasing the consistency of decisions, so that similar information produces similar decisions, and improving the quality of those decisions. Users of such aids, however, can grow to distrust an aid if its information or advice is questioned, or they can overtrust an aid by not recognizing its limitations [23]. Within the context of graduate education, these two mistakes can play out to the detriment of an organization. Distrust of the data provided by an aid can create inaction, uninformed action when the data are ignored, inefficiencies, cynicism, and resistance to others' data-based suggestions for change. Overtrust or overreliance on an aid can create inattention to obvious errors or a misunderstanding of contextual variables that limit generalizability [24, 25].

Decision aids have received considerable attention in the field of medicine, where the complexity of information often overwhelms the ability of a medical professional to consistently reach the best conclusions [26,27,28,29]. In the medical field, there is often resistance to the use of decision aids because they infringe on the autonomy of physicians. Much less work has been done on the development of decision aids in higher education, although the recent flurry of activity in the creation of dashboards to view university data is a strong step toward their availability [30,31,32]. Although increasingly available in support of undergraduate education, decision aid penetration into higher education institutions appears to be limited, given the lack of trust in the data underlying these aids as well as distrust in the motives of those releasing the aids [33]. There is also evidence that higher education administrators are not well-equipped to leverage data—they are data rich but insight poor [34]—and there are concerns that administrators are pushing to mimic their peers without strong insights into the utility of the data they are gathering [35].

At our own university, the development of dashboards has been largely focused on data relevant to the undergraduate mission of the university. This approach is rational since the graduate student population (Master’s and Ph.D. students) represents only 21% of our student population. Although graduate student data are present in many of the dashboards created, the data needs of graduate programs are quite different from those of undergraduate programs. For example, the admissions process for graduate programs is driven by program-level decisions using criteria that are quite heterogeneous across the university. Once a graduate student arrives, they must be tracked more closely to ensure that they are hitting key milestones. This process is quite different from that for undergraduates where it is usually sufficient to simply confirm that a student has passed the courses required to complete a degree (exceptions include required clinical rotations or field experiences in some majors).

In the present manuscript, we will describe our efforts to better track graduate student progress across the university. First, we wanted to make student progress very transparent by providing graphical and tabular representations of progress toward milestones. Milestones vary across degree programs but typically include filing a program of study, passing a preliminary exam, successful defense of a thesis or dissertation, submitting the properly formatted electronic thesis, and graduation. By providing this information in disaggregated form (i.e., at the student level), it became much easier to identify issues with individual students or advisors as well as shortcomings in the accuracy or quality of the data.

Second, we wanted to support the assessment of program performance. When students in a program are much less likely to file their programs of study in a timely fashion, there is a need for a procedural change or training of advisors. If a program has a much lower rate of students successfully completing a preliminary examination, there is a need to evaluate the possible sources of that shortfall including a failure to prepare students for the exam, a poorly structured exam, unrealistic expectations of performance, or the admission of students who cannot succeed.

Third, aggregated data empower the university and individual programs to identify possible differences in degree progress across demographics like full-time status, gender, funding source, domestic versus international status, and race/ethnicity. The existence of such relationships may suggest procedural changes. Handling these comparisons properly requires serious statistical analysis to ensure that lurking variables have been accounted for. For example, if a particularly challenging program tends to enroll more men, men's performance may appear to be poor when considered in the aggregate if program competitiveness is not considered (an example of Simpson's paradox, [36]).

3.1 Developing the dashboard

In the service of these goals, our university’s Graduate School and its Office of Data, Assessment, and Institutional Research partnered to create a Power BI tool to extract and display milestone data for graduate programs. Producing a useful dashboard presents challenges, and success requires a strong partnership with individuals who will be responsive and detail-oriented to ensure that the dashboard presents consistent and useful metrics, visualizations, and options.

Consistency began by establishing good operational definitions of terms, some of which are explained in an existing graduate policy handbook. In some cases, agreement came readily—the date on which a milestone was completed was the date entered by the Graduate School. In order to distinguish between a student who is still actively pursuing their degree and one who is not, we needed an agreed-upon time frame (a student was judged inactive if they had not been enrolled at any of the previous 6 semester census dates). In other cases, however, a definition that seemed clear and straightforward proved problematic once adopted.

For example, identifying a student as part-time versus full-time initially relied on an established and widely accepted definition. But this definition is semester-specific. We discovered that a vast majority of students were full-time in some semesters and part-time in others and thus appeared in reports based on full-time students as well as in reports based on part-time students. If we wanted to classify a student as part- or full-time for the duration of their enrollment in the program, this definition was not suitable. To ensure that a common method was used, we met with college administrators to discuss the issue. This discussion resulted in two metrics: (a) a student was classified as full-time only if they were full-time for a majority of their semesters enrolled in the program, and (b) a student was assigned a numeric value indicating the percentage of semesters in which they were enrolled full-time. Classifying a student into majority/minority full-time made it easier to report data based on this distinction—a common request. Supplementing this classification with a continuous metric allowed more sophisticated follow-up analysis based on a student's percent full-time enrollment. Similar issues arose when trying to classify a student as funded or not, or as funded by a GTA versus a GRA, again because many students switch their funding status one or more times during their time at the university.
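As a minimal sketch (not our production code), the two enrollment-status metrics could be derived from semester-level records along these lines; the table, column names, and values below are hypothetical.

```python
# Hypothetical semester-level enrollment records: one row per student per semester.
import pandas as pd

enrollments = pd.DataFrame({
    "student_id":   ["A", "A", "A", "B", "B", "B", "B"],
    "semester":     ["F21", "S22", "F22", "F21", "S22", "F22", "S23"],
    "is_full_time": [True, True, False, False, False, True, False],
})

status = (
    enrollments.groupby("student_id")["is_full_time"]
    .mean()                              # metric (b): fraction of semesters enrolled full-time
    .rename("pct_full_time")
    .to_frame()
)
status["majority_full_time"] = status["pct_full_time"] > 0.5   # metric (a)
print(status)
```

In this toy example, student A would be classified as majority full-time (two of three semesters) whereas student B would not (one of four), while the continuous metric preserves the underlying percentages for follow-up analysis.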

Even more problematic was the definition of how many years or student credit hours it took to achieve each milestone. We discovered that departments varied in their definition of when a doctoral student started the program. Some departments admitted students with a Bachelor's degree directly into the doctoral and Master's programs simultaneously, and the student pursued the Master's along the way. Others admitted these students into their Master's program and changed their enrollment status to the doctoral program only after completing the Master's degree. Still others admitted these students only into the doctoral program and added the Master's program when the program of study for the Master's was submitted a year or two into the doctoral program (or students skipped the Master's altogether). Furthermore, not all students were treated the same way within the same department—a new department head or graduate program director may have changed the practice upon taking over. This variation made it impossible to compare and contrast the number of years or credits to each milestone when some programs counted the time in the Master's, some did not, and others started the Master's part-way into the doctorate. Thus, it became imperative to adopt a single definition within our dashboard and to communicate that definition to all users. We opted for time to degree from graduate school entry, as used by the National Science Foundation's Survey of Earned Doctorates [37].

These examples of challenging situations are provided not to discourage the development of a decision aid but rather to ensure that proper care and time are taken in its creation. If the validity of the data is questioned due to misunderstood or inconsistent definitions, the many benefits of providing data will be undermined and can harm future endeavors to create a data-informed culture.

3.2 The dashboard

Our dashboard was divided across six pages:

  • a program summary with pre-established definitions and little opportunity for filtering,

  • a table showing the time to each milestone with the opportunity to drill down to the individual student level and to filter in multiple ways,

  • a graphical representation of this data,

  • a second table based on student credit hours to each milestone,

  • a page for exporting data into an Excel spreadsheet based on filter settings, and

  • a final page providing the operational definitions of terms.

The initial page showing the program summary was designed to ensure that a standardized report was available to all users. Given that the other four reports allow a user to focus on particular years, student demographics, student status, etc., those reports can look quite different when users apply different filters. The program summary is therefore based on standard definitions—percent of students filing a program of study within the past 2 years, Master's students active within the past 4 years, and Ph.D. students active within the past 7 years—with all data automatically separated by majority full-time/part-time status and with college-level and university-level benchmarks provided. As a result, there is no possibility that one user would be showing data for the most recent 4 years, another for the past 10 years, and yet a third showing a 4-year span but from a different 4 years than the first user.

The next four pages, however, allow users to obtain a more nuanced view of the data by filtering based on student characteristics, advisor, first-enrolled semester, current status (e.g., active, inactive, graduated), most recent milestone met, employment status, race/ethnicity, and gender. These pages also allow the user to see student-level data with student names. This feature allows users to track individual students, identify those students needing follow-up, and explain unusual data values (e.g., a long time to meet a benchmark due to Pat Smith taking a leave of absence or Azariah Young switching to part-time status and completing the degree remotely). Our decision to provide student-level data made it necessary to limit access to those users who are up-to-date on their FERPA (Family Educational Rights and Privacy Act) training. Access to the much more comprehensive export table is limited to Graduate School personnel and, by request, to Associate Deans who oversee graduate education within each college.

The ability to examine individual students also served a secondary goal: empowering program administrators to work with the Graduate School or the Office of Institutional Research to correct errors in student records when, for example, a student has an impossible value (like a negative time to a milestone) or a suspicious one (like 15 years to file a program of study). This transparency increased trust in the data by helping users understand that they could help fix a problem and that a problem might lie in a data source over which they have control.

Figure 1 provides a snapshot of the type of report that can be generated based on years to each milestone in tabular form. In this example, we expanded the data for doctoral programs to the college level and further expanded College B to the department level. It is possible to continue down to the program and individual student level. This example is filtered to just those students who have graduated; thus, the time in program is the same as the time to graduation. By sorting on the Time in Program column, it is easy to identify those colleges and departments that have struggled to graduate their students in a timely manner.

Fig. 1 The years to milestones data for graduated students in tabular form

As is always the case with data, reports like the table shown in Fig. 1 prompt questions that can help contextualize the data: Is Department 1 enrolling more part-time students than the other departments? Are students in this department less likely to receive assistantships, thus extending their time to graduation? Are these students more likely to find employment before graduating and thus transition to part-time status? Are faculty keeping students around to work on grant-funded research, thereby slowing their progress to degree?

Figure 2 provides part of a student-level graphical report for a specific college at our university with student identities redacted. This report was filtered to show only active students enrolled in any of the college's graduate programs. Certain elements of the report would prompt further attention: why did it take one student 18.3 years to file a program of study, and why have two students who are currently active in the program been enrolled for more than 5 years without filing a program of study? There are often good explanations for these situations, but visualizations like that provided in Fig. 2 make it much easier to identify students, advisors, and situations that might need attention.

Fig. 2 The years to milestones data for active students in graphical form. Student names have been redacted

Our milestones dashboard also provides tabular data showing the number of student credit hours to each milestone. At first glance, it would be easy to assume that these data would be highly correlated with the time to a milestone, but this is not the case. Part-time students would show more years to reach a milestone but may achieve that mark after a similar number of student credit hours. Student credit hour reports have also allowed us to identify departmental practices resulting in students taking an excessive number of courses—many more than the number of credits required for the degree. In one situation, a department enrolled their doctoral students in summer credits in order for them to teach summer courses, but this practice was actually slowing their progress to degree and resulting in their taking as many as 50% more credit hours than required. In another situation, students in a college were enrolling in one or more certificate programs concurrently with their graduate degree, thus taking more credit hours; in this case, the practice was judged to better prepare the students for their chosen job market.

3.3 Rolling out the dashboard—challenges and successes

Given the surfeit of dashboards being developed and released by universities, administrators can be overwhelmed by the amount of data now available to them [34]. It is necessary to ensure that the right people are provided the right data at the right time and in the best format to maximize the data’s impact on student progress and departmental practices. A data-informed approach to graduate education needs to ensure that data are examined on a regular basis and not merely when a program is undergoing review every 5 to 10 years. Furthermore, due to the decentralized nature of graduate education and the resulting variability across programs, the users who can put the data to the best use are located within the departments because they know the context that is necessary for interpreting the data. College administrators and those in the Graduate School also need to examine program data in order to ask questions that may reveal a challenge faced by a department, a unique situation, or a problematic practice.

To address these disparate audiences, we made our milestones dashboard available to department heads, graduate program directors, appropriate graduate staff, associate deans, and other high-level administrators. Student-level data will be of greatest interest to department leaders and graduate program directors whereas program-, department-, and college-level data will be of greatest interest to college and university administrators. Usability was maximized by presenting the non-customizable program summary that uses the same default settings for all programs. Some customization is possible through filters on the other dashboard tabs to provide flexibility to address questions involving demographics, advisor, funding, subplan, etc. Although some programs have additional, program-specific milestones that they might like tracked (e.g., a second-year qualifying exam, completion of coursework requirements, proposal meetings, an oral examination), we have built the dashboard based on typical milestones that each graduate student must complete per Graduate School policies. We cannot customize the milestones beyond those already provided.

Because some users may consult any one dashboard only once a term or when compiling annual reports, creating step-by-step instructions for the use of a dashboard was clearly necessary. This need is even more important given that administrators often rotate through roles every few years. Although many dashboard skills transcend any individual report (e.g., how to sort a column or how to expand college-level data to examine department-level data), many elements may be specific to a report—especially as they relate to the nature of the data. We created instructions that (a) addressed the mechanics of how to use, filter, and save the data, and (b) identified situations that would be likely to create confusion. For example, one confusing situation involved Master's data in which students appear to take longer to file an ETDR (electronic thesis, dissertation, or report) than to graduate. This situation arose when the report included students in Master's programs with thesis requirements (and thus filing an ETDR) and those in course-only programs (and thus not filing an ETDR). Given that course-only students finish faster, their inclusion in the average time-to-degree but not the average time-to-ETDR produced this apparent anomaly. Explicitly identifying these situations in our instructions helps to maintain confidence in data that might otherwise be dismissed as unreliable.
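A small, entirely hypothetical example illustrates this aggregation artifact: course-only students contribute to the average time to degree but contribute nothing to the average time to ETDR.

```python
# Hypothetical records: thesis students file an ETDR shortly before graduating;
# course-only students graduate faster and never file one.
import pandas as pd

students = pd.DataFrame({
    "option":          ["thesis", "thesis", "course-only", "course-only"],
    "years_to_degree": [2.6, 2.4, 1.5, 1.6],
    "years_to_etdr":   [2.5, 2.3, None, None],
})

print(students["years_to_degree"].mean())  # 2.025 years, averaged over all four students
print(students["years_to_etdr"].mean())    # 2.4 years, averaged over thesis students only
# The mean time to ETDR exceeds the mean time to degree even though every
# thesis student filed the ETDR before graduating.
```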

After releasing the dashboard, Graduate School administrators were able to use existing Power BI tools to assess dashboard use. The biggest users of our milestones dashboard are currently Graduate School administrators and personnel in programs undergoing review. Other users include college administrators, graduate program directors, department staff, and even recruiters (e.g., to obtain data to use for marketing purposes).

4 Understanding and leveraging institutional trends: analysis challenges

Although the milestones dashboard was created to serve the immediate need to improve local practices in moving students through a program, the ready access to the data has proven extremely useful to Graduate School administrators. In addition to supporting programs in their self-assessments, targeting departments for recruitment assistance, and providing workshops on which actions might address departmental challenges, the data can be mined to identify trends and predictive relationships as well as the effectiveness of interventions. Doing so, however, presents statistical challenges that require a more sophisticated approach than merely counting and averaging or conducting simple regressions.

The first challenge is one of lurking variables. Because students are not randomly assigned to their majors, self-selection biases can drive results. At our university, women and men are present in different proportions across graduate programs (e.g., women represent 76% of Master's students in the College of Education and 25% of doctoral students in the College of Engineering), women represent a sizeable majority of Master's (57%) and certificate (67%) students but are a minority among doctoral students (46%), and women enroll full-time in a given semester (52%) less often than men do (57%). A failure to recognize the lurking variables of major, degree type, and full-time enrollment (and others) will thus contaminate attempts to understand differences in outcomes for women and men. For example, a simple analysis of our graduate student body revealed that women were taking fewer years to earn their degrees than men (Ms = 2.8 years vs. 3.0 years, respectively, t = 7.71, p < 0.001). However, women are more likely to be enrolled in Master's programs—a lurking variable when trying to understand this difference. When the type of degree was added to this model, the effect reversed (Ms = 3.1 years for women and 3.0 years for men, t = 2.66, p = 0.008), an effect that was driven by women taking 0.2 years longer to complete certificate programs but with no significant difference for Master's and doctoral programs. This reversal is a classic example of Simpson's paradox [36].
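The pattern can be reproduced with a few invented numbers (the records below are synthetic, not our institutional data): women appear faster in the aggregate only because they are concentrated in shorter programs.

```python
# Synthetic illustration of the Simpson's paradox pattern described above.
import pandas as pd

records = pd.DataFrame({
    "gender": ["F"] * 6 + ["M"] * 6,
    "degree": ["cert", "cert", "masters", "masters", "masters", "phd",
               "cert", "masters", "masters", "phd", "phd", "phd"],
    "years":  [1.8, 1.6, 2.4, 2.5, 2.6, 5.2,
               1.4, 2.4, 2.5, 5.0, 5.1, 5.3],
})

# Aggregate comparison: women appear to finish faster overall...
print(records.groupby("gender")["years"].mean())
# ...but within each degree type women take as long as or longer than men,
# because women are concentrated in the shorter certificate and Master's programs.
print(records.groupby(["degree", "gender"])["years"].mean())
```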

A second challenge in analyzing graduate student data is the large disparity in the number of students in each major and demographic group, which creates unbalanced data. Some programs at our university enroll as few as 5 students whereas others enroll 500 or more. When data are aggregated across programs before presentation or analysis, the resulting statistical summaries and trends will be dominated by a handful of large programs. For example, in the aggregate it may look like the university is making a strong move toward more online doctoral students, but this trend may be strongly positive in one large college but slightly negative in most other colleges. Although noteworthy for the one college, it would be incorrect to assume that the average upward trend reflects trends in most of the university's colleges or programs.

To tackle this challenge, university- and college-level data are best analyzed using multilevel modeling [38,39,40] because of its strengths in addressing unbalanced designs, modeling individual differences among departments without using a many-leveled variable like "program" as a predictor, modeling dependent data, and handling missing values efficiently (for applications in higher education, see [35, 41, 42]). Multilevel modeling can capture trends for programs, departments, or colleges that are typical of most of those units. This feat is accomplished by the model fitting trends at the level of individual units (e.g., departments or programs) and at the highest level (e.g., the university) at the same time. Units that are very small lack the data to support strong statements about unit-level trends, so conclusions regarding a trend within those units will be more strongly determined by what is typical in most units (small and large). Once a unit is sufficiently large to support strong inferences about its trend, that unit's analysis results will be much less influenced by other units' data and more strongly determined by its own data. When our earlier analysis of gender differences across degree types was repeated using a multilevel model with program as a random effect, the difference remained (overall Ms = 2.8 years for women and 2.6 years for men, t = 2.43, p = 0.015) and was again specific to certificate programs (Ms = 1.7 vs. 1.4 years, p < 0.001), but the estimated values were slightly lower because these lower numbers were more representative of the typical program rather than being dominated by the few large programs that happened to have longer years to completion.
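A minimal sketch of this kind of model, using the statsmodels mixed-effects interface, might look like the following; the file name and column names are hypothetical, and this is not our actual analysis code.

```python
# Sketch of a multilevel model of years to degree with a random intercept for
# program; the data file and variable names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

# One row per graduated student: years (float), gender, degree, program (categorical).
df = pd.read_csv("graduates.csv")

model = smf.mixedlm(
    "years ~ gender * degree",   # fixed effects: gender, degree type, and their interaction
    data=df,
    groups=df["program"],        # random intercept for each program
)
result = model.fit()
print(result.summary())
```

The random intercept lets each program have its own baseline years-to-degree, so estimates for small programs are pulled toward what is typical across programs rather than being swamped by the few largest ones.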

It is important to remember that even sophisticated analytical tools are still limited by the quality and complexities of the data. Dramatic growth or decline can result from the creation or closure of a program, from the movement of a program from one college to another that splits the program's data between two different departments or colleges, or from the movement of a program from on-campus to online delivery when that move was captured by using different program names.

There is no substitute for good data science skills and an understanding of one’s organization if data are going to be used to inform graduate education. A poorly conducted analysis can quickly undermine trust in a decision aid (or the administrator using the tool). Perhaps worse, a bad analysis may never be questioned and yet still prompt changes that are ineffective, misguided, or even harmful. Data is powerful but must be handled with care, and simple statistical summaries can misrepresent as badly as a poorly conducted multilevel analysis.

5 What comes next

Throughout the history of graduate higher education, administrators have found it necessary to update their program curricula, determine the best way to assess student performance, develop good mentoring, graduate their students, and make numerous other changes in an attempt to improve the quality of the graduate education being delivered. Throughout much of this history, however, these decisions were made intuitively as a result of personal experience. The emergence of data science across fields as diverse as sports, business, medicine, weather, and energy has revealed that many of our intuitions have been poor proxies for good judgment. But many in these fields have resisted the suggestion that their intuition is not uniquely superior (dramatically illustrated in the film Moneyball), and higher education is encountering the same resistance [33, 43, 44]. Ultimately, better performance wins out, especially in domains like sports and weather forecasting where predictive errors are very transparent [45,46,47].

Trust in the accuracy and relevance of data will lower the resistance to its use. Given that data are just information, it would be the rare academic who believes that decisions, policy, and procedure should ignore information. People want information, but incorrect data are worse than no data, so the hard work of data validation is critical to the success of our milestones dashboard. Resistance also arises when departments feel that data are being used as a cudgel for punishment rather than to inform practice and procedures. Providing the milestones dashboard to department chairs, graduate program directors, and graduate faculty empowered them to identify any data errors, to use the data to discuss academic program practices in the department (e.g., "Is the way we structure preliminary exams supporting progress toward graduation?"), and to use the data in their accreditation or program review processes.

Beyond the solicited and unsolicited feedback that the Graduate School obtains from dashboard users, we also hold annual one-on-one meetings with each college's Associate Dean who oversees graduate education, along with monthly meetings with graduate program coordinators. The meetings with the Associate Deans include discussing the college's current data, assisting the Associate Dean's understanding of the data, improving the Graduate School's understanding of the college context, and soliciting needs for additional variables or features. These and other meetings have resulted in adding new variables like veteran status to support a unit's accreditation process, highlighting slow degree progress in two units that prompted their re-evaluation of their use of qualifying examinations, and prompting serious discussion of the viability of some programs given their low enrollment and high attrition. The monthly meetings with graduate program coordinators center on best practices and how data help us identify them. Furthermore, programs are now more aware of the impact of a thesis requirement on degree progress; some programs are now considering non-thesis options for Master's students who are not progressing to doctoral programs.

The present article stopped short of making specific recommendations to revise graduate education because, as we have shown, many of the solutions are local to a program, department, college, or university. The data we have examined using our dashboard have made this conclusion abundantly clear because data profiles vary, often dramatically, across programs. When we ask questions, we can discover that a difference is due to a specific administrative assistant, an enterprising and attentive long-serving graduate program director, a handful of difficult faculty, or a failure to hold students or faculty accountable for degree progress. These situations existed before our dashboard was released, but seeing the data prompted closer scrutiny of current practices and conversations about updating practices and policies at both the academic program level and the Graduate School level.

As we have continued to investigate local trends in graduate education, questions continue to give rise to the need to track more data. Future challenges will include defining student success more broadly (e.g., moving beyond easy metrics like high GPA, short time to degree, and graduation to include metrics that are more difficult to measure, like quality of job placement, skill level, and student well-being) in order to better assess the factors that determine such success. Thus, we see our dashboard as a living data source that is intended to reward curiosity by growing and adapting in response to needs. Our dashboard has already spawned a companion dashboard in which we are tracking graduate applicants as they progress through our admissions process. The original launch of our milestones dashboard gave rise to a second version with clearer definitions, and a third version was released in January 2024 in response to new discoveries and needs. The latest version tracks more variables to support predictive analysis, and we added a retention chart to help programs identify the proportion of students falling into three key categories—still active, not retained, and graduated—across the years since enrollment. In some cases, data confirm things that we already believed were true, but in many others the data have challenged our beliefs and revealed new truths. As these truths are put into action, graduate education is sure to benefit.