Introduction

In recent decades, there have been dramatic changes in the forest sector due to global trends, policy processes, and changing structures in the markets of forest-based products. All these have impacted or will impact forests and their use. The world has lost 178 million ha of forest since 1990 [1], and agricultural expansion is one of the main reasons for this. Although the area of protected forest has increased, forest biodiversity has declined. Globally, biodiversity is declining faster than at any time in human history [2], and climate change is in turn reinforcing this development. On the other hand, forests have a major actual and potential role to play in global climate change mitigation. Trees and forest soils sequester carbon, wood products store carbon throughout their lifespan, and forest-based products and energy play a role as renewable substitutes for fossil-based products. The role of forests has been recognized in global policy processes, e.g., in the Paris Agreement on climate change [3], the United Nations Strategic Plan for Forests 2017–2030 [4], and the United Nations Sustainable Development Goals [5].

Forests play an increasingly important role in sustaining and increasing people’s overall well-being, and they make significant contributions to the United Nations Sustainable Development Goals (SDGs) [5]. People around the world depend on forests for their livelihoods, and rely on forest products such as timber and firewood, as well as their social, cultural, and environmental benefits. In forest management, the concept of multiple use has been increasing in recent decades [6]. However, in the light of the rapid growth of the world’s population, there is a threat of the increasingly unsustainable use of forests, which may exacerbate the existing global sustainability challenges like climate change and biodiversity loss.

Dramatic changes in global forest industry markets during the 2000s [7, 8] have also changed the chains of activities that create and add value to forest products through different phases of production to final consumption. With new products and processes, changes in value chains are expected to continue [9]. In addition to climate, environmental, and other policies, other global drivers of forest product demand, like economic development, demographics, customer preferences, technological developments, and globalization [10], affect the markets. Globalization, assisted by the cost-reducing effects of technological development, has led to increased trade in forest products and the possibility to utilize materials from different sources. Consumption and production are shifting from the traditional forest industry regions of North America, Western Europe, and Japan to rapidly growing markets like China [8].

At a time of global change, it is important that forest research reflects the requirements of the future. Analyses of the existing literature and trends are important tools in identifying knowledge gaps. Other information on the forest sector is also needed, such as results from foresight analyses and knowledge of emerging issues to help inform scientific research in a changing environment. Scientific information is a backbone of evidence-based policymaking, and gaps in knowledge may thus have a direct impact on practical decision-making.

Previous reviews of the academic forest science literature have focused on specific subject areas like forest ecology [11], deforestation [12], multiple use forest management [6], outdoor recreation and nature-based tourism [13], the sustainability of wood products [14], the potential of new industrial wood-based products [9], and sustainability communication in the forest sector [15]. Few previous syntheses of research trends in the international academic forest sector literature cover the entire forest wood chain (FWC) and use systematic methods of database searches. A previous study [16•], based on the years from 1979 to 2008, detected a clear increase in research on ecology and climate change, while the development of research on economic, political, and especially social factors remained weak. A more recent worldwide review focusing on forest ecosystem services (FES) indicated clear growth in social science research since 2008, while the growth of economic research remained limited [17••].

Methods for forest literature analysis and reviews have ranged from the analysis of manually gathered documents to the more established systematic methodologies of database searches [15]. A common approach has been to use predefined keywords in data processing, like keywords or title words obtained from research articles, or single words defined by the author [14, 16•, 17••, 18]. The present study applies the topic-modeling method [e.g., 19]. Instead of relying on predefined keywords that may condition the results, topic modeling allows unsupervised mapping, in which predefined keywords are unnecessary. This is the key benefit of this method compared with practices using predefined (fixed) words. Topic modeling allows emerging patterns in data to be discovered, thereby also allowing the identification of novel knowledge. To our knowledge, this is the first time that topic modeling has been applied to an analysis of the forest science literature.

The aim of the study is to provide an overview of worldwide academic forest sector research and to gain an insight into the existing trends. A quantitative topic-modeling method is tested to extract topics from the data. The analysis is based on the abstracts of 15 leading international peer-reviewed forest science journals obtained from the Web of Science [20]. The forest wood chain (FWC) constitutes the study’s structural framework. It is divided into four stages: forest resources and management; utilization of forests; industry processing and products; and end use, markets, and trade. PESTE analysis is applied to examine political, economic, social, technological, and ecological/environmental factors. The research questions to be answered are as follows:

  1. What main topic groups are evident in the four stages of the FWC from 2000 to 2019, and what are their trends over time?
  2. What are the main PESTE factors in the four stages of the FWC from 2000 to 2019, and what are their trends over time?
  3. Can we define research gaps?

Framework of the Study

The scope of forest science is broad, and it is influenced by a crossover from several related disciplines like botany, plant science, ecology, environmental science, chemistry, and agronomy [18]. Various sections are related to policy, economic, sociological, and ecological/environmental factors. As an applied science, it also covers subjects related to the perspectives of different stakeholders with interests in forests. To structure the topics obtained from the topic-modeling process, the present study applied the forest wood chain (FWC) framework, separated into four stages and five PESTE factors (Fig. 1). The FWC stages include production chains from forest to industrial processing, and downstream products like paper and wood products [21]. The framework is greatly simplified in the sense that the FWC is an entity of complex and dynamic networks [22] that change with the development of new production processes and products [9]. The topics to be structured in the stages of the FWC are referred to later as theme 1 topics, theme 2 topics, etc., which are further divided into five PESTE factors.

Fig. 1 Framework of the study

The first stage of the FWC includes forest ecosystems that are major providers of ecosystem services [23], for which biodiversity forms the basis. Global trends like climate change, socioeconomic development, the fragmentation of forest landscapes, and the emergence of new pests and diseases are increasing challenges for forests and forest planning [24]. In turn, the growing demand for forests for various purposes [25] indicates the increasing conflict between human requirements and forest biodiversity. All of this creates challenges for forest management to maintain and improve the production of wood and other forest ecosystem services.

The second stage of the FWC involves the traditional utilization of forests for timber production, as well as the production of other ecosystem services like physical raw materials other than wood, climate regulation, and cultural and social services [26]. The supply of non-timber forest products like forest berries and mushrooms plays a role in household use [27], and they are also related to recreational activities. The benefits of outdoor recreation, nature-based tourism, and health and well-being all form an important part of the values people gain from forests today [13, 28].

The third stage concerns the industrial processing of wood, which still focuses mainly on the production of pulp, paper, sawn wood, and other traditional wood products, although the forest industries are seeking ways to increase their profitability and competitiveness by testing and developing new wood-based materials, products, and modifications. In addition, new business strategies, processes, and technologies are being investigated—for example, the use of wood products for biofuels. The large potential of non-wood forest products as a source of renewable industrial raw material and medicine is increasingly recognized [27].

The last stage of the FWC covers changes in demand in end markets, which drive changes along the entire FWC, from the consumer or industrial end user back to the use of forest resources and forest management. Emerging new product groups related to construction and packaging, textiles, biofuels, and platform chemicals [8, 9] will also change the supply needs of forest-based raw materials in the near future. An example of such a change is the collection of forest energy from harvesting residues and stumps. Furthermore, changes in consumer preferences related to the mitigation of climate change, for example, will probably have significant effects on the use of wood-based products.

The FWC stages are further examined using PESTE analysis covering political, economic, sociological, technological, and ecological/environmental factors. PESTE is a variation of the PEST analysis commonly applied in a business environment [29]. Political topics include, e.g., policies related to conflicts between human demands and forest nature (e.g., land-use, the forest as a resource, forest conservation). The forest sector is impacted by sectoral and cross-sectoral policies. Policies primarily targeting other sectors also have effects on the FWC (e.g., trade, energy, and climate policies). Among the economic factors are value adding, competitiveness, income, costs, and investments, while sociological factors are related to such areas as employment, cultural values, recreation, and human health and well-being [e.g., 30]. Technological factors are related to the presence and development of technologies, for example, in mechanical engineering, manufacturing processes, and communication. Ecological/environmental factors include ecological processes, biodiversity, conservation, carbon sequestration, pollutants, wastes, and recycling.

Methods and Data

Topic-Modeling Method

Topic models have been used widely for text mining and information retrieval in many research fields like the biological sciences [31]. In the study of [32•], topic modeling was applied to discover themes and trends in transportation research. To our knowledge, it has not been used in analyses of forest research. Topic models are generative models that provide a probabilistic framework for the term (word) occurrence in documents of a given corpus (i.e., a collection of texts being studied). In topic modeling, a document (here, an abstract) is modeled as a mixture of topics, and each topic is a mixture of words. In other words, each abstract consists of multiple topics, and each topic consists of multiple words. The key question in topic modeling concerns how to discover a topic distribution for each document and a word distribution for each topic [e.g., 31, 33].

As such, the topic models represent an unsupervised classification method. Instead of using fixed single keywords, topic modeling can extract information statistically from the data by clustering words into topics and topics into subject categories, thereby allowing emerging new information to be found. This is an important advantage of topic modeling compared with common human-assigned approaches [34]. The present study uses topic modeling in the extraction of topics from the abstract data. The classification of topics into FWC/PESTE categories is made qualitatively. Although a large amount of data can be analyzed by the topic-modeling method, human intelligence is needed to interpret and classify the topics.

The idea behind topic models can be described in simplified form by the following steps. First, we assume that we have a vocabulary of V different words/terms. Second, we assume that there is a fixed number k of topics, and each of those topics has its own word distribution (β), i.e., the probability that a word will be used in that topic. This means that certain words are likely to be used when talking/writing about certain topics. Third, there is a topic proportion (γ) for each document, which describes the share of each topic in that document. Finally, topic models assume that each document with N words is generated so that the topic proportions are drawn first for the document. After this, for each of the N words, a topic is drawn from the topic proportions, and then, based on the drawn topic, the actual word is drawn from that topic’s word distribution.
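The steps above can be sketched in a few lines of Python. This is a minimal illustration only, not the estimation code used in the study (which was carried out in R): the toy vocabulary, the two toy topics, and the function names are hypothetical, and the topic proportions are drawn from a Dirichlet distribution as in LDA, which is an assumption made here for simplicity.

```python
import random


def draw_dirichlet(alpha, rng):
    """Draw one probability vector from a Dirichlet via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(n_words, vocabulary, beta, alpha, rng):
    """Generate one document with n_words words.

    beta[k][v] is the probability of word v in topic k (word distribution).
    The returned gamma holds the document's topic proportions.
    """
    n_topics = len(beta)
    gamma = draw_dirichlet(alpha, rng)  # topic proportions are drawn first
    words = []
    for _ in range(n_words):
        # Draw a topic for this word position from the topic proportions.
        topic = rng.choices(range(n_topics), weights=gamma)[0]
        # Draw the actual word from the drawn topic's word distribution.
        word = rng.choices(vocabulary, weights=beta[topic])[0]
        words.append(word)
    return gamma, words


rng = random.Random(0)
vocabulary = ["harvest", "thinning", "plywood", "price"]  # toy vocabulary, V = 4
beta = [
    [0.50, 0.40, 0.05, 0.05],  # toy topic 0: forest operations
    [0.05, 0.05, 0.50, 0.40],  # toy topic 1: product markets
]
gamma, words = generate_document(10, vocabulary, beta, alpha=[1.0, 1.0], rng=rng)
```

Estimation runs this process in reverse: given only the observed words, the model infers plausible values of β and γ.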

The most common type of topic modeling is Latent Dirichlet Allocation, or LDA [35], in which topics are assumed to be uncorrelated. The natural addition to LDA is a correlated topic model (CTM) [36], in which the presence of one topic may be correlated with the presence of another. This improves the model by allowing a higher probability for some topics to co-occur than for others. This means that the occurrence of one topic makes another topic more (positive correlation) or less likely (negative correlation) to occur. Another addition is the structural topic model (STM), in which the document topic prevalence and topical content can be related to arbitrary document metadata, i.e., how much a document is associated with a topic, and the words used within a topic are related to covariates [37, 38]. For example, a covariate could be the journal in which a given article is published, and this could affect the document topic probability, meaning that some topics are more likely to be present in articles published in certain journals.

We used the CTM method in our topic models because it accounts for the fact that some topics co-occur in documents more commonly than others. For example, we may have three topics: harvesting technology; forest thinning; and customer preferences for plywood. We can assume that topics about harvesting technology and forest thinning are more likely to co-occur in documents than topics about forest thinning and customer preferences for plywood. CTM therefore also gives a better fit to the document collection than the LDA method and enables the construction of topic graphs for visualizations of the relationships between topics [36].
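The three-topic example can be illustrated with a small simulation. The sketch below assumes the logistic-normal construction commonly used for correlated topic models: a correlated Gaussian vector is drawn and mapped to topic proportions with a softmax. The covariance values, the number of simulated documents, and all names are hypothetical and chosen only to mimic the example; this is not the model fitted in the study.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a small positive definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][m] * lower[j][m] for m in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def draw_topic_proportions(mean, chol, rng):
    """Draw correlated Gaussian scores and map them to proportions (softmax)."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    eta = [mean[i] + sum(chol[i][j] * z[j] for j in range(i + 1))
           for i in range(len(mean))]
    peak = max(eta)  # subtract the maximum for numerical stability
    weights = [math.exp(e - peak) for e in eta]
    total = sum(weights)
    return [w / total for w in weights]


def pearson(xs, ys):
    """Sample Pearson correlation between two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


# Hypothetical covariance: topics 0 (harvesting technology) and 1 (forest
# thinning) tend to appear together; topic 2 (plywood preferences) does not.
covariance = [
    [1.0, 0.8, -0.3],
    [0.8, 1.0, -0.3],
    [-0.3, -0.3, 1.0],
]
rng = random.Random(1)
chol = cholesky(covariance)
samples = [draw_topic_proportions([0.0, 0.0, 0.0], chol, rng) for _ in range(2000)]
harvesting = [s[0] for s in samples]
thinning = [s[1] for s in samples]
plywood = [s[2] for s in samples]
corr_related = pearson(harvesting, thinning)
corr_unrelated = pearson(harvesting, plywood)
```

LDA has no counterpart to the off-diagonal covariance terms, which is why it cannot express that two topics tend to appear together.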

Data Collection

The topic analysis was based on the abstracts of 15 global peer-reviewed forest sector research journals (Table 1). They are briefly characterized in Appendix A. The aim was to identify leading international journals covering the area of the FWC. Two searches with the keyword “forestry” were applied in selecting the journals [39, 40] in April 2020. Journals in languages other than English were excluded, because they have a more limited number of readers and authors. International forestry journals are for the most part published in English [41]. Furthermore, carrying out the study in several languages would have complicated the analysis significantly.

Table 1 Journals selected for analysis (Source: Web of Science [20])

The main criterion in selecting journals was to obtain a sample of leading international forest sector journals of a broad scope and covering the entire area of the FWC. To cover the entire area, some journals with a rather limited scope (e.g., related to wood sciences or economics) were included. This also meant that several journals focusing on forest management and ecology, an area that comprises a major share of all forest sciences research, had to be limited. This area is present in many contexts in the 15 selected journals, for example, the Canadian Journal of Forest Research, Forest Policy and Economics, and Forest Science. As the main aim of the study was to obtain an overview of forest sector research and an insight into the existing trends, we believe that the 15 selected journals reliably reflect the state and trends of research into the international FWC area.

Abstracts were used as proxies for the research articles in the study. It was assumed that an article’s abstract included the main themes and topics in a compact and accessible form that served the purposes of this study [42]. Altogether, 14,470 abstracts from the selected journals were obtained from the Web of Science [20] in March 2020 by setting the search for the period from January 2000 to December 2019.

The increase in the number of articles (and abstracts) per year during this period was significant (Fig. 2). Forest Policy and Economics, the Canadian Journal of Forest Research, and the Journal of Forestry Research especially showed growth. Meanwhile, the number of abstracts from the Forest Products Journal, which focuses on wood science and technology (Appendix A), declined. The total growth in the number of articles should be regarded as indicative only, because some journals did not appear in the Web of Science in 2000, even if they were published at that time. For example, the data for the European Journal of Wood and Wood Products were only available in the Web of Science for the period between 2009 and 2019.

Fig. 2 The number of articles published by selected journals during the period from January 2000 to December 2019

Data Processing

Data analysis was performed with the statistical software R [43]. The first step in the data processing was to split (tokenize) the abstracts into words and to lemmatize those words (i.e., reduce multiple word forms to a single form, the word’s lemma) to produce a vocabulary for the whole dataset. This step was performed using the udpipe package [44]. Lemmatization was used to manage the several inflected forms in which words normally appear. The next step was to condense the vocabulary by removing certain words (Table 2). First, we removed words belonging to the standard “stop word” list in natural language processing, e.g., articles and prepositions. For this purpose, the tidytext package for text mining in R [45] was used. In addition, all tokens consisting of digits (i.e., numbers) were removed after it was found that they caused unwanted variation in model estimation. However, “words” containing digits in certain contexts, like chemical compounds (e.g., NO3, CH4, CO2), were retained. Finally, words occurring fewer than five times were removed. The resulting number of unique words in the vocabulary was 14,824 (Table 2). They occurred 1,508,569 times in all the abstracts. Using this vocabulary, the median number of words per abstract was 105; the 25th percentile was 85, and the 75th percentile 123.
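The filtering steps can be sketched as follows. The study itself used R with the udpipe and tidytext packages; the Python below is only an illustrative stand-in. The tiny stop word set, the rule for recognizing compound-like tokens (letters followed by digits), and the function names are assumptions, and lemmatization is omitted.

```python
import re
from collections import Counter

# Tiny stand-in for a standard stop word list (assumption, for illustration).
STOP_WORDS = {"the", "of", "in", "and", "a", "to", "is", "for", "on", "with"}

# Assumed rule for tokens such as co2, ch4, no3: letters followed by digits.
COMPOUND_LIKE = re.compile(r"^[a-z]+\d+$")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keep_token(token):
    """Drop stop words and plain numbers, but keep compound-like tokens."""
    if token in STOP_WORDS:
        return False
    if any(ch.isdigit() for ch in token):
        return bool(COMPOUND_LIKE.match(token))
    return True


def build_vocabulary(abstracts, min_count=5):
    """Count the kept tokens and drop those occurring fewer than min_count times."""
    counts = Counter(
        token
        for text in abstracts
        for token in tokenize(text)
        if keep_token(token)
    )
    return {word: n for word, n in counts.items() if n >= min_count}


abstracts = [
    "The forest stores CO2 in 2019",
    "Forest soils release CO2 and CH4",
    "The price of plywood in the forest sector",
]
# min_count is lowered to 2 only because the toy corpus is so small.
vocabulary = build_vocabulary(abstracts, min_count=2)
```

In this toy corpus, the year "2019" is discarded as a plain number while "co2" survives, mirroring the exception made for chemical compounds.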

Table 2 The vocabulary used in the analyses

Determining the Optimal Number of Topics

Topic models were estimated using the CTM method and an R package [46]. Regardless of the method of calculation, the algorithm requires certain input parameters, of which the most important is the number of topics K. It can be selected by quantitative testing and by qualitative analysis of the results [32•]. There is no definite measure for the “right” number of topics. Instead, multiple measures can be evaluated to find an “optimal” K. In the present study, four measures were applied. The first measure [35], based on topic density, selects the number of topics adaptively, using density clustering (the measure to be minimized). The second measure [47] chooses the optimal number of topics by the Kullback–Leibler (KL) divergence-based measure. It utilizes the idea that topic allocation can be presented as a matrix factorization (the measure to be minimized). The third measure, semantic coherence [48], is maximized when the most probable words in a given topic frequently co-occur. Another measure which is commonly used with semantic coherence is exclusivity [37]; i.e., the words in a certain topic have a high probability of occurrence only in that topic and a low one in others. The fourth measure, held-out likelihood [49], also performs residual analysis [50].
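To make the idea of semantic coherence concrete, the sketch below implements one common formulation, usually attributed to Mimno et al. (2011): for the top words of a topic, sum log((D(w_m, w_l) + 1) / D(w_l)) over all word pairs, where D counts the documents containing the given word or words. Whether this is exactly the variant behind the measure cited above is an assumption; the function name and toy documents are hypothetical.

```python
import math


def semantic_coherence(top_words, documents):
    """Coherence of one topic from document co-occurrence counts.

    top_words is ordered from most to least probable; every word is
    assumed to occur in at least one document (otherwise D(w_l) = 0).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(all(w in doc for w in words) for doc in doc_sets)

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            joint = doc_freq(top_words[m], top_words[l])
            score += math.log((joint + 1) / doc_freq(top_words[l]))
    return score


documents = [
    ["carbon", "stock"],
    ["carbon", "stock"],
    ["carbon", "price"],
]
coherent = semantic_coherence(["carbon", "stock"], documents)
incoherent = semantic_coherence(["stock", "price"], documents)
```

Here "carbon" and "stock" appear together in two documents, whereas "stock" and "price" never co-occur, so the first word pair scores higher.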

Because the corpus of the present study is relatively small, the optimal number of topics was sought in the range [10, 300], with a step size of 10 for the possible number of topics, K. Depending on which of the four measures was applied, the optimal number was estimated to lie approximately in the range of 130–200 topics (Appendix B). As Appendix B shows, the improvement in performance from using an increasing number of topics ceases at some point, and there is a relatively large plateau of quite similar performances. Appendix C presents the results of the semantic coherence and exclusivity [37] measures used to select the “best” number of topics. The figure in Appendix C describes the efficient semantic coherence-exclusivity frontier, where the efficient model, i.e., the number of topics, can be chosen based on the trade-off between coherence and exclusivity. After examining a few different numbers of topics (e.g., K = 120, 140, 160, 200, and 240), we selected K = 160. Consequently, 160 topics were extracted by topic modeling from the data and analyzed further.
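One way to read the efficient frontier is as a set of non-dominated models: a candidate K is kept if no other candidate is at least as good on both coherence and exclusivity and strictly better on one. The sketch below assumes that both measures are to be maximized. The score values are invented purely for illustration and are not the study's results; the function name is hypothetical.

```python
def efficient_frontier(scores):
    """Return the K values whose (coherence, exclusivity) pair is not dominated."""
    frontier = []
    for k, (coherence, exclusivity) in scores.items():
        dominated = any(
            other_coh >= coherence
            and other_exc >= exclusivity
            and (other_coh > coherence or other_exc > exclusivity)
            for other_k, (other_coh, other_exc) in scores.items()
            if other_k != k
        )
        if not dominated:
            frontier.append(k)
    return sorted(frontier)


# Invented scores (NOT from the study): K -> (semantic coherence, exclusivity).
toy_scores = {
    120: (-60.0, 9.0),
    140: (-62.0, 9.3),
    160: (-63.0, 9.6),
    200: (-70.0, 9.5),
}
candidate_ks = efficient_frontier(toy_scores)
```

In the invented example, K = 200 drops out because K = 160 is better on both measures; choosing among the remaining candidates still requires judgment about the trade-off.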

Results

Classification of Topics into FWC/PESTE Categories

The estimated 160 topics were analyzed by their statistical properties, based on word topic probabilities \(\beta\), document topic probability \(\gamma\), and the correlation between document topic probabilities \(\rho\). The extracted topics were also examined by visualizing them using word clouds [e.g., 32•, 51], which summarize the content of documents by highlighting the most frequently used words. For example, Appendix D presents a word cloud for topic number 1. In a word cloud, the text size of each word is in proportion to its probability. Words with the largest text sizes therefore have the highest probabilities of occurrences in the topic.

Classification was made manually by the research team into themes 1–4 (corresponding to the FWC stages) and PESTE categories 1–5. Each team member made the classification of the topics independently, after which the results were examined in the team’s work meetings to form a common view. Manual classification was used to incorporate human intelligence in the analysis. An alternative method would have been clustering, but this would have meant assuming that the clustered topics would have had a similar vocabulary. In this case, it could have been challenging to differentiate between themes and PESTE factors. For example, topics related to forest resources differ greatly if analyzed from an ecological or economic perspective. Manual classification of topics and their qualitative interpretation also provided a quality control for the topics identified by the statistical model. Table 3 summarizes the classification results. Some topics that seemed not to belong to any of the FWC themes were classified as “Other.” In Appendix E, all 160 topics are numbered and classified. For each topic, up to 7 top words with the highest probabilities \((\beta )\) of occurrences are presented.
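Listing the top words of a topic amounts to ranking the vocabulary by that topic's word probabilities. A minimal sketch follows, with a hypothetical function name and made-up probabilities:

```python
def top_words(beta_row, vocabulary, n=7):
    """Return up to n words with the highest probability in one topic."""
    ranked = sorted(zip(vocabulary, beta_row), key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in ranked[:n]]


# Made-up word probabilities for a single toy topic (illustration only).
vocabulary = ["policy", "growth", "governance", "actor", "soil"]
beta_row = [0.40, 0.05, 0.30, 0.20, 0.05]
labels = top_words(beta_row, vocabulary, n=3)
```

The same ranking underlies a word cloud, where the probabilities set the text sizes instead of a cut-off.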

Table 3 Classification of topics in FWC themes and PESTE categories

Themes 1 and 3 had the largest number of topics. The results of PESTE analysis showed that ecological and technological topics had the largest proportion. In our data, ecological research on forests and technological research into industry and product technologies have dominated forest sector research in the last two decades. In the following, we construct an overview of topics classified in the FWC/PESTE categories.

In the largest theme (forest resources and management), most of the PESTE topics were ecological, but topics representing policy, economic, social, and technological factors were also found. Because the extracted topics are multi-word, the classification into specific FWC/PESTE categories needs interpretation. In this respect, word clouds of topics and a careful reading of the respective abstracts were needed. For example, topic number 12, “policy, governance, actor, rights, implementation, paper, country,” was interpreted as mostly about policies, and it was classified in the policy category. Topic 88, “carbon, sequestration, emission, stock, forest, mitigation, change,” was also about policies, which could be concluded only after reading the abstract. The other three policy topics were about land use (104), sustainability (151), and policy implementations (148, Appendix E).

The economic topics of theme 1 concerned research on timberland investment, wood production, and price risk (16, 24, 82). Social topics covered areas like local forest management and social preferences for forest-related programs (44, 70). Technological topics were about, e.g., forest inventory, thinning, and genetics (17, 43, 142). Ecological topics constituted the largest group covering research on forest management change, wood growth, climate change, diversity, and forest damage (topics 5, 8, 18, 37, 87).

The utilization of forest resources (theme 2) was dominated by economic topics. They were about tree selection in harvesting, harvesting costs (75, 121, 138), and the chipping costs of forest residues (132). Economic topics in this theme largely reflect the traditional role of wood procurement in the use of forests. However, topics on other uses of forests were also detected. They were about berries (4) and visitors (78). The only social topic found in this theme concerned the health and recreational benefits of forests (60), and the only technological topic was about forest planning (158). Neither ecological/environmental topics nor policy topics emerged.

In theme 3 (industry, processing, and products), technological topics dominated. Their focus was mainly on wood sciences covering the properties of wood materials and products. For example, topics on product quality were about wood-based panels (126, 150) and composites (14, 110). Topics related to strength properties concerned plywood stable joints, furniture frames, and wood adhesives (topics 11, 76, 115). Topics on protective treatments for wood materials were also found, including heating methods, preservatives, and microwaves (25, 136, 154). The economic area in this theme was represented by one topic, lumber processing in sawmill industries (39). The policy, social, and environmental topics did not emerge.

Theme 4 (end use, markets, and trade) included five topics, and it was the smallest theme. Economic topics were related to product markets, internet business, and price determination in international markets (topics 1, 114, and 129). Two sociological topics were also detected after a deeper analysis of the abstract documents. They were about corporate social performance and perceptions of wooden interiors (topics 85, 107). The policy, environmental, and technological topics could not be detected.

Network of Topics Co-appearing in the Abstracts

The CTM modeling allows document topic probabilities to be correlated, which means that positively correlated topics are more likely to be discussed together within a document than negatively correlated topics. The network visualization of correlations among the topics (Fig. 3) is generated by topic modeling. Visualization helps to examine the topics’ structure. The topics belonging to the same theme in our FWC framework tend to group together. This indicates a match between our manual classification and the grouping obtained from topic modeling. In Fig. 3, different colors are used to describe the groups/themes. The green group corresponds to theme 1 (forest resources and management), the red group to theme 2 (utilization of forests), the blue group to theme 3 (industry, processing, and products), and the yellow group to theme 4 (end use, markets, and trade). Topics in black are those that did not seem to belong to any of the FWC themes in our classification (Appendix E). Note that only part of the topic structure (links with correlations ≥ 0.143 in absolute values) is presented in Fig. 3.

Fig. 3 Network visualization of topic correlations for the period 2000–2019. Correlations ≥ 0.143 in absolute values are presented. The size of the circle depicts the number of links for a topic (degree). Correlations between topics are described by lines of different gray shades. Darker shades describe stronger correlations, and lighter shades weaker correlations. Negative correlations are indicated by dotted lines. The topic groups corresponding to the FWC themes are indicated with different colors
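The construction of such a network from a topic correlation matrix can be sketched as follows: keep only the topic pairs whose absolute correlation reaches the threshold, and count the links per topic to obtain the degree. The 0.143 threshold is taken from the text; the function name and the toy matrix are hypothetical.

```python
def build_network(correlations, threshold=0.143):
    """Return (edges, degree) for topic pairs with |correlation| >= threshold."""
    n_topics = len(correlations)
    edges = []
    degree = [0] * n_topics
    for i in range(n_topics):
        for j in range(i + 1, n_topics):
            rho = correlations[i][j]
            if abs(rho) >= threshold:
                edges.append((i, j, rho))  # a negative rho would be drawn dotted
                degree[i] += 1
                degree[j] += 1
    return edges, degree


# Toy symmetric correlation matrix for three topics (illustration only).
toy_correlations = [
    [1.00, 0.30, -0.20],
    [0.30, 1.00, 0.05],
    [-0.20, 0.05, 1.00],
]
edges, degree = build_network(toy_correlations)
```

In the toy matrix, the weak link between the second and third topics (0.05) falls below the threshold and is left out, just as weak links are omitted from the figure.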

Of the four themes, end use, markets, and trade (depicted in yellow) stands out most clearly in the network. All five of this theme’s topics obtained from manual classification are included in Fig. 3. The green group, corresponding to theme 1, is formed around a few central topics: tree variation (15); stand structure (111); height (2); and growth (8). They represent traditional research areas related to forest resources and management. The blue group, corresponding to theme 3, is formed around technological topic 10, which concerns wood material. The red group, representing theme 2, is fragmented. Topic 160 (study, result, potential, provide, due, analysis, include) is also one of the central topics, and it includes words typical of a scientific abstract. It does not specifically include any thematic information about the abstracts’ content.

The network of topics also highlights the interwoven structure of forest research between themes and topics. For example, there are certain topics that connect topic groups, like the blue technological topic 21 about forest residues for energy. It connects chipping costs of forest residues (132) and harvesting costs (121) from the red group and topic 24 about production from the green group. With the strongest correlation between topics 21 and 132, these topics naturally co-occur in the abstracts about wood energy. The connection between the topics of three themes (3, 2, and 1) can be interpreted to highlight active research on wood energy in technological and economic perspectives in the FWC framework.

Another topic connecting groups is 61 from the blue group. It concerns the technological measurements of wood material. It is central in the sense that it connects the separate blue topics 23, 39, and 46 with the green group. Topic 23 is about the grading of sawlogs, topic 39 about lumber, and topic 46 about the quality of wood material. All four topics are connected with the green cluster via topic 17 about forest inventory. This connection of two groups corresponding to themes 1 and 3 can be interpreted as highlighting active research on measurements in technological and economic perspectives in the FWC framework.

Negative correlations are shown in Fig. 3 with dotted lines. One interesting negative correlation is between green topic (5) about forest management change and blue topic (10) about wood material, which seems to divide topics between the two groups. This refers to different perspectives related to research on wood, which is an industrial material, as well as a natural resource.

Time Development of Forest Sector Research

The time development of forest sector research was analyzed by relative shares of topic groups corresponding to the FWC/PESTE classification (Table 3). Three timepoints were used in the analysis, based on the relative group share, i.e., the share of probability among the group’s topics from the periods 2000–2006, 2006–2013, and 2013–2019. In the following, some main trends are summarized.
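
The relative group share can be illustrated with a minimal Python sketch, assuming that each abstract carries a period label and a vector of topic probabilities, and that each topic is mapped to one FWC/PESTE group. The mapping, the probabilities, and the function name below are hypothetical; how abstracts from the boundary years 2006 and 2013 were assigned is not stated here, so period labels are taken as given.

```python
from collections import defaultdict


def group_shares(docs, topic_group):
    """Share of topic probability held by each group, per period.

    docs: iterable of (period, {topic_id: probability}) pairs.
    topic_group: maps each topic id to its group label.
    Returns {period: {group: share}}; shares sum to 1 in each period.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for period, theta in docs:
        for topic, prob in theta.items():
            totals[period][topic_group[topic]] += prob
    shares = {}
    for period, by_group in totals.items():
        mass = sum(by_group.values())
        shares[period] = {g: v / mass for g, v in by_group.items()}
    return shares


# Hypothetical topic-to-group mapping and abstracts.
topic_group = {8: "theme_1_Ecol", 10: "theme_3_Tech", 121: "theme_2_Econ"}
docs = [
    ("2000-2006", {8: 0.6, 10: 0.3, 121: 0.1}),
    ("2000-2006", {8: 0.2, 10: 0.7, 121: 0.1}),
    ("2013-2019", {8: 0.7, 10: 0.2, 121: 0.1}),
]
shares = group_shares(docs, topic_group)
for period in sorted(shares):
    print(period, {g: round(s, 2) for g, s in shares[period].items()})
```

Because the shares are normalized within each period, a rising share for one group necessarily implies a falling share for at least one other group, which is worth keeping in mind when reading the trends.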

According to the classifications, the first theme, forest resources and management, includes 68 ecological topics (Table 3). This group is the largest topic group, and it is called “theme_1_Ecol” in Fig. 4. The relative share of this group has increased during the research period. At the same time, the second largest group, the technological topics of industry, processing, and products (theme_3_Tech), has slightly lost shares. However, the trend changes in these two groups have been relatively small.

Fig. 4 Time development of forest sector research from 2000 to 2019. Relative shares of topic groups corresponding to the FWC/PESTE classification are presented at three time points, based on the periods 2000–2006, 2006–2013, and 2013–2019

A notable decrease was detected in the share of economic topics of theme 4, end use, markets, and trade (theme_4_Econ), during the research period. The decline was sharp from 2000–2006 to 2006–2013. The economic topics of industry, processing, and products (theme_3_Econ) declined especially in the second part of the period. The utilization of forests, theme 2, was the only theme in which economic topics gained share during the research period (theme_2_Econ). However, the positive trend took place only from 2000–2006 to 2006–2013, after which the share of this group also started to decline slightly.

One of the most important changes was in the position of sociological topics of forest resources and management (theme_1_Soc). The relative share of this group has risen notably during the research period, especially from 2000–2006 to 2006–2013. Positive trends indicate an increasing research interest in social issues, although the number of topics was small. The role of policy research remained minor in the present analysis. Policy topics were detected only in the first theme, forest resources and management, where there was a positive trend in the occurrence of research related to this issue (theme_1_Pol).

Discussion and Conclusions

The aim of the study was to gain an insight into the global academic forest sciences literature during the 2000–2019 period. The most frequent research subjects and research trends were identified using 14,470 abstracts of 15 leading peer-reviewed international journals. The topic-modeling method was applied to find the topics, after which they were classified into FWC themes and PESTE categories based on qualitative interpretations.

The key benefit of the topic-modeling method compared with commonly used human-assigned approaches is that it allows unsupervised mapping [e.g., 34]. Predefined (i.e., fixed) keywords that may condition the results are not needed, because topics are extracted automatically in the statistical modeling. When keywords are not fixed, emerging new patterns in data can be discovered, thereby also allowing the possibility of identifying novel knowledge. The present study applied a correlated topic model, CTM [36], which allowed us to visualize the relationships between topics. Topic-modeling classification schemes are fruitful in examining correlations between topics, which helps to structure the data, whether this is undertaken using human or model-generated schemes. This also highlights the interwoven structure of forest research between themes and topics.
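
As a minimal sketch of why a CTM can represent relationships between topics: in the correlated topic model, each document’s topic proportions are obtained by drawing a vector from a multivariate normal distribution and mapping it to proportions with the softmax function, so the covariance matrix carries the topic correlations. The Python below illustrates this single step only; the mean vector, covariance matrix, and function names are hypothetical, and it is not the estimation procedure used in this study.

```python
import math
import random


def cholesky(sigma):
    """Lower-triangular factor L with L * L^T = sigma (sigma must be positive definite)."""
    n = len(sigma)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                L[i][j] = (sigma[i][j] - s) / L[j][j]
    return L


def draw_topic_proportions(mu, sigma, rng):
    """Draw eta ~ Normal(mu, sigma) and return softmax(eta)."""
    L = cholesky(sigma)
    z = [rng.gauss(0.0, 1.0) for _ in mu]
    eta = [m + sum(L[i][k] * z[k] for k in range(i + 1))
           for i, m in enumerate(mu)]
    peak = max(eta)  # subtract the maximum for numerical stability
    weights = [math.exp(e - peak) for e in eta]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical parameters: topics 0 and 1 tend to co-occur,
# and both are negatively related to topic 2.
mu = [0.0, 0.0, 0.0]
sigma = [[1.0, 0.8, -0.5],
         [0.8, 1.0, -0.5],
         [-0.5, -0.5, 1.0]]
rng = random.Random(0)
theta = draw_topic_proportions(mu, sigma, rng)
print([round(t, 3) for t in theta])
```

The off-diagonal entries of the covariance matrix are what distinguish this model from one in which topic proportions are drawn without any dependence between topics.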

The perspective in this study is global, and it may therefore miss some aspects. The analysis was limited to scientific articles published in English, which may have restricted the emergence of some nationally or regionally important topics. Similarly, among the selected journals, not all regions may be equally represented, because the focus was on representing international forest sector research as comprehensively as possible, and geographical representativeness was only subsequently considered. Furthermore, part of the research may be published elsewhere than in scientific journals, in documents with restricted publicity, or in books. For example, policy research can be sensitive from the perspective of policy-relevant actors, in which case, funding may focus more on policy processes than on scientific publications. In addition, a large part of the economic or technological research on industry processes and products is probably published in specific journals not included in the present study, or their general distribution may be restricted.

According to the analysis, ecological research on forest resources and management (theme 1) was the largest, and a slightly increasing, research area. This is a traditional field of forest research covering several topics, including forest species, growth, inventory, planning, regeneration, plantation, diversity of species, and forest management. This research area will also probably be important in the future [18]. The increase in social and policy topics within theme 1 indicated increasing interest in these research areas, even if their number was small compared with the ecological topics. The rising trend in social research was especially noteworthy. This mirrors the international objectives set for forests to support livelihoods and income and to fight poverty [5]. No increase in social research has been detected in previous studies [16•, 18]. However, our results were in line with the more recent study by [17••].

The second-largest research area was technological research on industry processing and products (theme 3), in which there was a slight reduction in share during the research period. According to the topic subjects in this area, the interest was mainly in material and product properties, different treatments of wood, and wood extracts. Many of the topics could be interpreted as belonging to the field of wood science, which, according to [18], has shown no specific trend growth either. However, as wood-based materials play an important role in the transition to a low-carbon bioeconomy, and wood-based raw materials may improve the sustainability of many present solutions [52], research in this area can be expected to grow in the future.

The remaining themes, i.e., utilization of forests (theme 2) and end use, markets, and trade (theme 4), account for a minor component of the forest-related research, at least when measured quantitatively. However, as these themes are vital, especially considering future challenges, there seems to be a significant gap in academic forest sector research. The few topics that were related to the utilization of forests (theme 2) mostly represented economic research, and even this research trend declined in importance during the second part of the study period. As the benefits that forests provide for human beings are large and diverse, the relative share of theme 2 research might have been assumed to be greater. The study of [17••] also observed a limited amount of literature on ecosystem services (which is part of our theme 2) within the forest research, although interest in this area is increasing. This increase may be stimulated by the requirement for more information on the capacity of forest ecosystems to sustain the present and future welfare of society [17••].

Similarly, the role of the research on end use, markets, and trade (theme 4) was smaller than one might have expected given the dramatic changes in the demand and market structures [53] during the last 15–20 years. Moreover, interest in economic and social research in this theme declined significantly during the first ten years of the study period, showing no subsequent recovery. As demand for forest products and wood material is growing, novel forest-based products are emerging, and changes in market demand will continue, it would be justified to direct more academic research to theme 4. Moreover, changes in markets are transmitted throughout the FWC, from customers to forests. Although large forest sector models describing the functioning of the whole FWC have produced global and regional market scenarios [53], research is needed on changing consumption and production patterns, as is also observed by [7].

Despite the drastic changes in the operational environment of the forest sector, such as changes in the end-use markets of forest products and the increased importance of global policy processes impacting the use of forest resources, the results of this review still reflect the traditions of forest sciences. In theme 1, ecological subjects comprise the majority of research on forest resources and management, while that relating to economic, sociological, and policy perspectives constitutes a minor proportion. In theme 2, research was mainly economic, and moreover, it was mostly related to timber and harvesting. Furthermore, [6] notes that timber production still tends to be the most recognized ecosystem service. Academic research on industry, processing, and products (theme 3) was strongly focused on technological research, although the importance of economic, sociological, and policy factors is increasing in this subject area, boosted by globalization, climate change, and the different dimensions of sustainability. According to [54], forest industries’ traditional production-orientation thinking is shifting to customers and markets. Yet, the role of markets for forest products (theme 4) appears to receive little attention in forest sciences, at least in academic publications.

Growing demand for forests indicates increasing opposition between different interest groups and between different uses of forests. There is also an increasing need for evidence-based policy, especially in issues such as climate change, biodiversity, and bioenergy [e.g., 55]. All this indicates a need for forest policy research in general and a need for cross-cutting policy research. According to this study, there has been little such academic research in the past, because policy topics were detected only in theme 1. However, effective policy decisions play a key role in solving global problems, and policy research should cover the entire forest sector and FWC, not only selected parts of it.

In summary, the results of the study indicate that ecological research on forest resources and technological research on industry and forest products are the main subjects investigated in forest science. A clear decline was detected in the research on end-product markets, even if changes there have been significant, and changing markets drive changes in the entire forest-wood chain. To support the goals of transition to a sustainable bioeconomy, it will be important to increase research on policy impacts, as well as social and ecological sustainability issues to cover all stages of the FWC more evenly.

The study’s limitations need to be considered in interpreting the results. However, we believe that the present results give an overview of worldwide academic forest sector research and insights into the existing trends, which may be of help in directing research work. In addition, the topic-modeling method gives an example of an unsupervised analysis tool for discovering topics. Because our results cover the years between 2000 and 2019, other information is also needed, such as foresight analyses and other knowledge on emerging issues. Moreover, given that research tendencies are dynamic, a regular analysis of forest research trends is necessary. Such analysis is needed to inform researchers in directing their work, research management in planning research calls and future projects, editors in adjusting their publication policy, and policymakers in defining research strategies.