Model selection
Using the stm package in R (Roberts et al. 2014a), we estimate a correlated topic model (CTM), which is an extension of Latent Dirichlet Allocation (LDA). CTM relaxes an assumption made by LDA by allowing the occurrence of topics to be correlated (Blei and Lafferty 2007; Blei and Lafferty 2009). We use the spectral method of initialization as it guarantees, under certain assumptions, recovery of the globally optimal parameters and, compared with other methods of initialization, is faster and produces better results (Roberts et al. 2016). Topic modeling is an exploratory technique and will give researchers any number of topics they request. There is no ‘right’ number of topics, and the number of topics should be chosen based on interpretability and analytic utility with regard to the research question (DiMaggio et al. 2013; Grimmer and Stewart 2013; Roberts et al. 2014b). We generated a set of candidate models with different numbers of topics (i.e. 5, 10, 20, 30, 50, and 100). This indicated that the ideal ‘level of granularity of the view into the data’ (Roberts et al. 2014b:1069) lay between 30 and 40 topics. After estimating additional models in the range of 30–40, we selected the 31-topic solution, as it served our research question better than the other models. This does not mean that our analysis proves that there are exactly 31 topics in the field of research on higher education. Rather, we selected the model with 31 topics as it gave us the best compromise between parsimony, doing justice to the variety of themes in the field, and substantive interpretability. In addition, models with more than 31 topics include topics with very low relative prevalence, which indicates that these topics are peripheral to the field and offer little insight into its overall structure.
As mentioned earlier, a topic is defined as a distribution over all observed words in the corpus. Inspecting the most probable words of a topic is the way to interpret its substantive meaning and to label it. To improve interpretability, we rank the words within each topic based on their frequency (i.e. the occurrence rate of a word in a topic) as well as their exclusivity (i.e. the extent to which a word is exclusive to a topic). For this, we rely on the FREX measure (FRequency and EXclusivity), which is the mean of a word’s rank in terms of exclusivity and frequency (Airoldi and Bischof 2016). Next to the per-topic-per-word probabilities (β), the analysis also yields the per-document-per-topic probabilities (γ). The γ-probabilities inform us which articles load on which topic. A close reading of the documents that load highly on each topic further helps us in the interpretation and the validation of the model.
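The FREX ranking described above can be illustrated with a minimal sketch. It assumes the commonly used formulation in which FREX is a weighted harmonic mean of a word's empirical-CDF rank on exclusivity and on frequency within a topic; the weight `w`, the function names, and the toy `beta` matrix are illustrative assumptions and are not taken from our analysis.

```python
from bisect import bisect_right


def ecdf(values):
    """Empirical CDF value of each entry: the share of entries <= that entry."""
    ordered = sorted(values)
    return [bisect_right(ordered, v) / len(values) for v in values]


def frex(beta, w=0.5):
    """FREX scores from beta[k][v], the probability of word v under topic k.

    Assumption: weighted harmonic mean of ECDF ranks, with w the weight
    placed on exclusivity. Every word needs positive mass in some topic.
    """
    n_topics, n_words = len(beta), len(beta[0])
    totals = [sum(beta[k][v] for k in range(n_topics)) for v in range(n_words)]
    scores = []
    for k in range(n_topics):
        # Exclusivity: share of a word's total probability found in topic k.
        exclusivity = [beta[k][v] / totals[v] for v in range(n_words)]
        freq_rank = ecdf(beta[k])
        excl_rank = ecdf(exclusivity)
        scores.append([1.0 / (w / e + (1.0 - w) / f)
                       for e, f in zip(excl_rank, freq_rank)])
    return scores


def top_words(topic_scores, vocab, n=3):
    """The n words with the highest score in one topic."""
    order = sorted(range(len(vocab)), key=lambda v: topic_scores[v], reverse=True)
    return [vocab[v] for v in order[:n]]


# Toy data (made up): two topics over a five-word vocabulary.
vocab = ["student", "alcohol", "university", "study", "feedback"]
beta = [
    [0.30, 0.28, 0.22, 0.15, 0.05],
    [0.30, 0.02, 0.25, 0.18, 0.25],
]
scores = frex(beta, w=0.7)  # w=0.7 is an arbitrary illustrative choice
print(top_words(scores[0], vocab))
```

In this toy example ‘student’ is the most probable word in both topics, but because it is shared between them, the FREX ranking of the first topic places the more exclusive word ‘alcohol’ ahead of it.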
To improve interpretability of the results, we present the 31 topics in three sets. Note that this is an ex post decision made after estimating the topic model. The first set relates to topics characteristic of research with a theoretical focus on individuals, while the second set relates to research interested in organisation- and system-level mechanisms. The third set of topics includes discipline-specific topics and a topic related to methods. Next to a label (which both authors agreed upon after close reading of the highest loading articles and in-depth discussion), we also assign a number to each topic. This number is not substantively meaningful and is just a way to guide the reader through the results.
Individual level topics
The 16 topics related to individual level themes can be grouped in four categories. The most prominent words for the topics in these categories are presented in Table 2. The first category relates to student health. For example, the three most prominent words for the first topic—which we label substance use and health—are ‘behavior’, ‘intervention’, and ‘alcohol’. The article most strongly associated with this topic studies drinking and driving among college students (Fromme et al. 2010). Similarly, the other three topics relate to aspects of student health, i.e. stress and anxiety (2) (e.g. ‘stress’, ‘emotion’, and ‘anxiety’), sexual activity and health (3) (e.g. ‘female’, ‘sexual’, and ‘male’), and mental health (4) (e.g. ‘mental’ and ‘treatment’).
Table 2 The ten highest-ranked words on the individual level topics (relative prevalence of each topic between parentheses)
The second group of individual-level topics relates to different subgroups of students. For example, the topic internationally mobile students (5) includes words such as ‘international’, ‘mobility’, and ‘abroad’. The highest loading article on this topic studies how perceptions of domestic and foreign higher education systems drive students’ international mobility (Park 2009). One of the top words of this topic is ‘Chinese’. This indicates that international mobility is often studied with regard to Chinese students and that ‘Chinese’ is a word exclusive to that topic (i.e. it occurs very infrequently in other topics). The two other topics pertain to subgroups related to ethnicity. Topic 6 pertains to racial and ethnic minorities (e.g. ‘African’, ‘black’, ‘white’, and ‘ethnic’), while topic 7 focuses on ethnic diversity and on the way this affects experiences on campus.
The next group of individual-level topics relates to various aspects of pedagogy. For example, topic 10 focuses on student performance in relation to parenting styles. The highest loading article on this topic studies how parental attachment, parental education, and parental expectations affect academic achievement (Yazedjian et al. 2009). Other topics in this group relate to feedback on assessment (8) (e.g. ‘feedback’ and ‘assess’), cognitive styles (9) (e.g. ‘complex’, ‘cognition’, and ‘understand’), educational technology (11) (e.g. ‘learn’, ‘online’, and ‘technology’), and skills, training and development (12) (e.g. ‘skill’, ‘develop’, and ‘curriculum’).
The final group of topics on the individual level focuses on academics. For example, topic 13 centres on academic careers and mentoring (e.g. ‘faculty’, ‘career’, and ‘program’). The highest loading article on this topic provides information for undergraduate advisors on how to assist students in identifying graduate programs (Shoenfelt et al. 2015). The second highest loading article studies the retired professor (McMorrow and Baldwin 1991). A second topic, i.e. teaching practices (14), captures research on teaching styles and preferences. A third topic, i.e. changing academic careers (15), relates to, for example, the work/life balance of academics with young children (Currie and Eveline 2011) and identity formation (Cox et al. 2012). The final topic in this group relates to research on doctoral students and supervision (16) (e.g. ‘supervisor’ and ‘PhD’).
Organisation- and system-level topics
The most prominent words for the topics in this set are presented in Table 3. Two groups of topics can be distinguished within this set, i.e. topics related to the organisational level and topics related to the system level. At the organisation level, the first, rather large topic (4.0% of the corpus) includes top words such as ‘policies’, ‘quality’, and ‘governance’. This topic (17) captures research dealing with quality assurance and accountability of higher education institutions. The highest loading article on this topic studies the impact of cross-border accreditation on national quality assurance agencies (Hou 2014). The next two topics on the organisation level pertain to leadership (18) and strategy and mission (19). For example, the highest loading article on leadership studies the development of a leadership identity (Komives et al. 2005). The topic model also discovers a topic related to sustainability (20), with top words such as ‘sustainability’, ‘plan’, and ‘environment’. Articles loading high on this topic study, for example, sustainability plans of higher education institutions (e.g. Swearingen 2014).
A final topic on the organisational level focuses on organisational change (21). The highest loading article on this topic studies decision-making processes in universities and develops a typology of strategies (Bourgeois and Nizet 1993). It was challenging to find an appropriate label for this topic, as it is characterized by considerable internal variation. For example, the second highest loading article studies inter-institutional cooperation (Lang 2002), while the third highest loading article addresses changes in the funding of American higher education (Johnstone 1998).
Table 3 The ten highest-ranked words on the organisation- and system-level topics (relative prevalence of each topic between parentheses)
The second group of topics captures research on the system level. Topic 23, for example, deals with university rankings and performance (e.g. ‘rank’, ‘fund’, and ‘university’). The highest loading article on this topic studies how different indicators of university rankings are related to each other (Soh 2015). Topic 22 relates to knowledge society and globalization, and the most characteristic article for this topic applies a network approach to globalizing higher education (Chow and Loo 2015).
The final topic on the system level focuses on student financial aid (24) (‘enrol’, ‘financial’, and ‘aid’). Articles loading high on this topic approach financial aid from a system-level perspective, for example, by studying state support for financial aid programs (Doyle 2010). However, this topic also captures research approaching financial aid at the individual level. For example, Kofoed (2017) studies why some students apply for financial aid, while others do not.
Other topics
The final set of topics contains topics that are either very specific or very generic. The most prominent words for these topics are presented in Table 4. For example, top words for the topic on physics and engineering (25) are ‘engineer’, ‘instruct’, and ‘solve’. All articles loading high on this topic are published in the Journal of Engineering Education. The other discipline-specific topics, i.e. topics 26–29, are also very specific to certain journals. We also find two generic topics: one on research ethics (30) and one on methods (31). Because these other topics are very specific to a journal or very generic, they do not provide us much insight into the structure of the field of research on higher education. Therefore, in the following analysis, we pay less attention to this set of topics.
Table 4 The ten highest-ranked words on the other topics (relative prevalence of each topic between parentheses)
Topics through time
To study the way topics evolve over time, we plot the posterior per-document-per-topic probabilities (γ) against the year of publication. We fit a locally estimated scatterplot smoothing (LOESS) curve, which gives us a smooth summary of the data points (smoothing span/alpha set to 0.7). In this way, Fig. 1 shows the evolution of the relative prevalence of each topic through time. It is clear that topics within a certain group of topics do not necessarily follow similar evolutions.
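As an illustration of this kind of smoothing, the following minimal sketch fits a local linear regression with tricube weights at every observed year. It is a simplified stand-in for a full LOESS implementation: the local linear fit (rather than a local quadratic one), the function name `loess`, and the toy values are assumptions made for the example, and only the span of 0.7 comes from our analysis.

```python
def loess(x, y, span=0.7):
    """Local linear regression with tricube weights, evaluated at each x.

    span is the fraction of observations used in every local fit.
    """
    n = len(x)
    k = max(2, min(n, int(round(span * n))))  # neighbourhood size
    fitted = []
    for x0 in x:
        # Bandwidth: distance from x0 to its k-th nearest observation.
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12
        w = [(1.0 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        # Weighted least squares for y = a + b * (x - x0); the fit at x0 is a.
        sw = sum(w)
        sx = sum(wi * (xi - x0) for wi, xi in zip(w, x))
        sy = sum(wi * yi for wi, yi in zip(w, y))
        sxx = sum(wi * (xi - x0) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - x0) * yi for wi, xi, yi in zip(w, x, y))
        det = sw * sxx - sx * sx
        if abs(det) < 1e-12:
            fitted.append(sy / sw)  # degenerate case: weighted mean
        else:
            fitted.append((sxx * sy - sx * sxy) / det)
    return fitted


# Toy data (made up): gamma of one topic in ten articles, by publication year.
years = [1991, 1994, 1997, 2000, 2003, 2006, 2009, 2012, 2015, 2018]
gamma = [0.02, 0.05, 0.03, 0.06, 0.04, 0.08, 0.07, 0.11, 0.09, 0.12]
trend = loess(years, gamma, span=0.7)
```

A larger span averages over more articles and yields a smoother curve, while a smaller span follows year-to-year fluctuations more closely.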
Several topics seem to be on the rise: internationally mobile students (5), feedback on assessment (8), educational technology (11), doctoral students and supervision (16), leadership (18), and knowledge society and globalization (22). Other topics, on the other hand, clearly lose ground over time: sexual activity and health (3), parenting styles and student performance (10), and academic careers and mentoring (13). In addition, there are a few topics that clearly go up and down: substance use and health (1), quality assurance and accountability (17), and student financial aid (24). Some developments are counterintuitive. For example, one might expect increased attention to the efficiency and economics of higher education in light of greater attention to these themes in public policy. Also, we expected a more monotonic growth in attention to quality assurance and accountability.
Clusters of topics
Recall that our approach models each article as a mixture of topics. Therefore, one way to map the structure of the field of research on higher education is to reveal which topics tend to be combined in articles. For this, we use a Q-mode cluster analysis on the document-topic probability distributions (Footnote 1). We tested whether the cluster solution is robust over time by comparing the cluster solution presented here with the cluster solution for the oldest articles in our data and with the one for the most recent articles. The substantive interpretation remains the same. So, while the topics are in constant flux (see above), the way topics are clustered remains stable over time. Therefore, we present the cluster solution for the complete data, which covers the period 1991–2018.
Figure 2 presents the dendrogram yielded by the hierarchical clustering on the topics. The distance at which topics merge indicates how dissimilar they are. Topics that ‘meet’ at a smaller distance are more similar in terms of their distribution over the documents than topics that meet at a larger distance. In this way, topics that cluster together are topics that are often combined in research. For example, ‘racial and ethnic minorities’ and ‘racial/ethnic diversity and campus climate’ are the most similar topics, as they link at a distance of around 0.75. The topic ‘racial and ethnic minorities’ is, on the other hand, very dissimilar to other topics. For example, the topics ‘racial and ethnic minorities’ and ‘feedback on assessment’ only meet at distance 7, which indicates that both topics are very rarely combined in research.
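How merge distances in a dendrogram arise can be sketched with a small agglomerative clustering of topics. The sketch below is illustrative only: it assumes Euclidean distances between the topics' loadings across documents and average linkage, which are not necessarily the choices behind Fig. 2, and the function names, topic labels, and toy γ matrix are made up.

```python
import math


def topic_distance(a, b):
    """Euclidean distance between two topics' loadings across documents."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def agglomerate(gamma, labels):
    """Average-linkage clustering of topics; gamma[d][k] is the loading of
    document d on topic k. Returns (left members, right members, distance)
    for every merge, in the order in which the merges happen."""
    profiles = [list(column) for column in zip(*gamma)]  # one row per topic
    clusters = {i: [i] for i in range(len(profiles))}
    merges = []
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                # Average linkage: mean distance over all cross-cluster pairs.
                pairs = [(p, q) for p in clusters[a] for q in clusters[b]]
                dist = sum(topic_distance(profiles[p], profiles[q])
                           for p, q in pairs) / len(pairs)
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        dist, a, b = best
        merges.append(([labels[i] for i in clusters[a]],
                       [labels[i] for i in clusters[b]], dist))
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges


# Toy data (made up): four documents, three hypothetical topics.
labels = ["topic A", "topic B", "topic C"]
gamma = [
    [0.50, 0.40, 0.10],
    [0.45, 0.45, 0.10],
    [0.10, 0.10, 0.80],
    [0.05, 0.15, 0.80],
]
for left, right, dist in agglomerate(gamma, labels):
    print(left, right, round(dist, 3))
```

In this toy example, topics A and B load on the same documents and therefore merge first at a small distance, whereas topic C joins them only at a much larger distance.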
The dendrogram is valuable in showing the different ‘islands’ in the field of research on higher education. At distance 7, we see two clusters. The left cluster includes topics related to subgroups of students and students’ health. The right cluster combines topics on the system/organisation level and the pedagogical topics. At a distance of around 5, this second cluster splits up again.
The clustering shows that topics we included in the same set often tend to cluster together. For example, the four topics on student health cluster together in the dendrogram. However, topics on subgroups of students and topics on pedagogy, which are both individual-level topics, are very distant from each other. In this way, our analysis identifies potential gaps in the literature. Our analysis, for example, suggests that there are very few studies in the field of research on higher education that combine a focus on pedagogy with a focus on racial and ethnic minorities. The lack of research combining both sets of topics is remarkable, as both topics are very often combined in the sociology of education. Indeed, the relation between ethnicity and achievement in primary and secondary education is a blossoming field (e.g. Dworkin and Stevens 2014). Similarly, the dendrogram exposes a lack of research on the combination of organisational- and system-level topics on the one hand, and student outcomes on the other. This is reminiscent of Jackson and Kile (2004:286), who argued that ‘new frameworks (e.g. theories, models, and concepts) are needed to help understand the various ways institutions affect students’. We believe that addressing the gaps identified by our analysis (i.e. topics that are rarely considered simultaneously in existing research) provides many opportunities for research.
Topic diversity
We have already noted that the clusters of topics (‘islands’) remain stable over time. However, the cluster analysis does not indicate whether or not the islands tend to move further apart from each other. To address this, we compute the Shannon entropy index (for topics 1 to 24), which measures the relative balance of the topics in each abstract and gives us an indication of the topic diversity of each article. The lower the value on the Shannon entropy index, the more exclusively an article focuses on one topic (and vice versa). The left-hand panel of Fig. 3 shows the evolution of topic diversity over time for the entire corpus, while the right-hand panel breaks it down for each journal type. The general trend is clearly downward. That is, more recent articles are characterized by less topic diversity (Footnote 2).
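The entropy index can be illustrated with a minimal sketch. It assumes the standard definition H = −Σ p·ln(p) over an article's topic probabilities; the use of the natural logarithm, the renormalisation after dropping excluded topics, the function name, and the toy values are assumptions made for the example.

```python
import math


def shannon_entropy(gamma_row, keep=None):
    """Shannon entropy of one article's topic probabilities.

    keep: optional indices of the topics to include (e.g. only topics 1 to
    24). Assumption: the retained probabilities are rescaled to sum to one.
    """
    probs = [gamma_row[k] for k in keep] if keep is not None else list(gamma_row)
    total = sum(probs)
    probs = [p / total for p in probs]
    # Terms with p == 0 contribute nothing to the sum and are skipped.
    return sum(-p * math.log(p) for p in probs if p > 0)


# Toy data (made up): a focused article and a diverse article over 4 topics.
focused = [0.97, 0.01, 0.01, 0.01]
diverse = [0.25, 0.25, 0.25, 0.25]
print(shannon_entropy(focused), shannon_entropy(diverse))
```

An article loading on a single topic has an entropy of zero, while an article spread evenly over K topics reaches the maximum of ln(K).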
The right-hand panel of Fig. 3 shows that the general downward trend can be attributed to articles published in topic-specific and discipline-specific journals. The topic diversity of articles in generic journals is the highest, which indicates that our measure of topic diversity indeed captures topic specialisation, and it has remained stable over time. Articles published in other journals, however, show a clear trend towards specialisation.