Creating an academic landscape of sustainability science: an analysis of the citation network
- Cite this article as:
- Kajikawa, Y., Ohno, J., Takeda, Y. et al. Sustain Sci (2007) 2: 221. doi:10.1007/s11625-007-0027-8
Sustainability is an important concept for society, economics, and the environment, with thousands of research papers published on the subject annually. As sustainability science becomes a distinctive research field, it is important to define sustainability clearly and grasp the entire structure, current status, and future directions of sustainability science. This paper provides an academic landscape of sustainability science by analyzing the citation network of papers published in academic journals. A topological clustering method is used to detect the sub-domains of sustainability science. Results show the existence of 15 main research clusters: Agriculture, Fisheries, Ecological Economics, Forestry (agroforestry), Forestry (tropical rain forest), Business, Tourism, Water, Forestry (biodiversity), Urban Planning, Rural Sociology, Energy, Health, Soil, and Wildlife. Agriculture, Fisheries, Ecological Economics, and Forestry (agroforestry) clusters are predominant among these. The Energy cluster is currently developing, as indicated by the age of papers in the cluster, although it has a relatively small number of papers. These results are compared with those obtained by natural language processing. Education, Biotechnology, Medical, Livestock, Climate Change, Welfare, and Livelihood clusters are uniquely extracted by natural language processing, because they are common topics across clusters in the citation network.
Keywords: Sustainability science; Research on research; Citation network; Overview map; Network analysis
Sustainability is an important concept for society, economics, and the environment (Lélé 1991; Goodland 1995; Christensen et al. 1996). Although the essence of the concept of sustainability has a long history dating back to JS Mill and TR Malthus (Goodland 1995), it did not become a significant issue in its present context until recently. In their book The Limits to Growth, Meadows et al. (1972) warned that our future development is limited and constrained by the growing world population and the depletion of natural resources. For the further development of society we must seek growth in a sustainable manner, as envisioned by the World Commission on Environment and Development (WCED) (1987), which proposed the concept of sustainable development in Our Common Future (also known as the Brundtland Report). These two publications aroused public interest in sustainability and sustainable development, posing challenges such as the management of contradictory problems, for example growth versus limits, intergenerational versus intragenerational equity, and individual versus collective interests (Dovers 1993).
A list of academic journals including “sustainable” or “sustainability” in their titles. The retrieval was performed using the Online Public Access Catalog (OPAC)
Journal of Sustainable Agriculture
Journal of Sustainable Forestry
Journal of Sustainable Tourism
The International Journal of Sustainable Development and World Ecology
Renewable & Sustainable Energy Reviews
The Journal of Sustainable Product Design
International Journal of Sustainable Development
Journal of Sustainable Development in Africa
Environment, Development and Sustainability
International Journal of Agricultural Resources, Governance and Ecology
International Journal of Sustainability in Higher Education
International Journal of Environment and Sustainable Development
International Journal of Technology Management & Sustainable Development
International Journal of Agricultural Sustainability
International Journal of Sustainable Energy
World Review of Science, Technology and Sustainable Development
Agronomy for Sustainable Development
Sustainability: Science, Practice, & Policy
There has been a long debate on the definition of sustainability (Brown et al. 1987; Barbier 1987; Simon 1989; Shearman 1990; Lélé 1991; Redclift 1992; Goodland 1995; Callicott and Mumford 1997). The Brundtland Report defined sustainable development as development that “meets the needs of the present generation without compromising the ability of future generations to meet their own needs.” Sustainability is lexically defined as “the ability to maintain something undiminished over some time period” (Lélé and Norgaard 1996). While sustainable development is associated with the human exploitation of nature, “sustainability” does not include such a connotation. In fact, the meaning of sustainability depends on the context in which it is applied (Brown et al. 1987; Shearman 1990). We must keep in mind that sustainability is not a goal; it is a constraint on the achievement of other goals (Marcuse 1998). Marcuse (1998) gives the following example: the problem with the conditions of the world’s poor is not that they cannot be sustained but that they should not be sustained. In short, sustainability is a prerequisite for attaining a goal, and that goal means different things to different people. Therefore, “sustainability” is polyphonic and polysemic, and its content may differ from context to context.
The vague definition of sustainability is not necessarily an obstacle at the nascent stage of research and development. To some extent, the value of the phrase lies in its broadness and its ability to stimulate vigorous and open discussion. It also allows people with conflicting positions in the environment-development debate to search for common ground on which to compromise (Lélé 1991). In some situations, avoiding a rigorous definition may have a fruitful outcome. WCED (1987) defined sustainable development in a manner that, although somewhat vague and inoperative, attracted wide attention and endorsement (Dovers 1993).
The vagueness of the definition also makes it difficult to grasp the overall structure of sustainability science, however. It is, for example, difficult to answer the question “What is sustainability science, and what disciplines does it include?” Such a discourse is common for other young academic domains, as seen in environmental studies (Soulé and Press 1998). Efforts to offer a comprehensive understanding and definition of a research domain have conventionally been made by domain experts. But grasping the current status of sustainability science has become an urgent task because of the growing body of publications as shown in Fig. 1.
To meet this challenge a computer-based approach can be used to complement the expert-based approach because it is compatible with the scale of information (Börner et al. 2003; Boyack et al. 2005). A citation-based approach, which is computer-based, operates on the assumption that citing and cited papers have similar research topics. By analyzing this citation network, we can comprehend the structure of a research domain constituting a larger volume of papers than we can read. In previous works, a citation-based approach has been applied to water resource management (Thelwall et al. 2006) and ecological economics (Costanza et al. 2004; Ma and Stern 2006). The objective of this paper is to provide an academic landscape of sustainability science by using citation network analysis as a computational support tool.
Data and method
Assuming sustainability science in its historical context and current state to be reflected in academic publications, we collected a set of academic publications including “sustainability” or “sustainable” in their titles, abstracts, and keywords. We collected citation data for those publications from the Science Citation Index (SCI) and the Social Sciences Citation Index (SSCI) compiled by the Institute for Scientific Information (ISI), because SCI and SSCI are two of the best sources of citation data. We used Web of Science, which is a Web-based user interface for ISI’s citation databases, and searched the papers using sustainab* as a query, where * represents a wildcard. The corpus thus acquired therefore contains papers that include either “sustainability” or “sustainable.” A total of 29,391 such papers were retrieved. We realized, however, that some of these papers might not be relevant to sustainability science because they were retrieved via the simple query described above. Therefore we focused on the maximum connected component, which currently consists of 9,973 papers. In other words, we regarded papers not linked by citation to other papers in the component as digressional from the mainstream of sustainability science and eliminated them. We checked whether those eliminated papers also formed a large network, but found that the second-largest connected component has only 35 nodes. We therefore considered it reasonable to focus on the maximum connected component to reveal the structure of sustainability science.
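The filtering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name connected_components, the paper identifiers, and the toy citation pairs are all hypothetical, and citation links are assumed to be treated as undirected when components are formed.

```python
from collections import defaultdict, deque

def connected_components(citations):
    """Group papers into connected components, largest first.

    `citations` is an iterable of (citing, cited) pairs. Links are treated
    as undirected here, which is an assumption of this sketch.
    """
    neighbors = defaultdict(set)
    for citing, cited in citations:
        neighbors[citing].add(cited)
        neighbors[cited].add(citing)

    seen = set()
    components = []
    for start in neighbors:
        if start in seen:
            continue
        # Breadth-first search collects every paper reachable from `start`.
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    component.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return sorted(components, key=len, reverse=True)

# Hypothetical toy data: p1..p4 are linked by citations, p5 and p6 form a small island.
toy_citations = [("p1", "p2"), ("p2", "p3"), ("p4", "p3"), ("p5", "p6")]
toy_components = connected_components(toy_citations)
largest_component = toy_components[0]   # papers kept for the analysis
second_largest = toy_components[1]      # inspected to check that little is lost
```

Papers with no citation links at all never appear in the pair list, so they are dropped implicitly in this sketch. Comparing the sizes of the first and second components mirrors the check reported above (9,973 papers versus 35 nodes).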
In addition to citation network analysis we used natural language processing (NLP) to analyze the structure of sustainability science. We employed NLP as a supplemental method for citation network analysis. As a citation network might have a citation bias, it was used to illuminate only one facet of sustainability science; NLP was expected to illustrate another facet. In NLP we first identified key terms that often appear in the abstracts of the 29,391 papers. We then measured the similarity between the extracted terms. Using the calculated similarity, the terms were merged into clusters. We analyzed those clusters on the assumption that they would reflect some aspect of the current status of sustainability science.
After term recognition we counted the occurrence of those terms in each abstract. We then expressed the result by using a vector space model (VSM) (Salton et al. 1975). VSM encodes a collection of documents by a term-document matrix whose [i, j]th element indicates the association between the i th term and the j th document. In our case, a term is a sequential word extracted by the NC-value method and a document is an abstract. We calculated the similarity between two terms by the cosine of the angle between their vectors. Briefly, we regarded the similarity of the terms to be high when they appeared in the same abstracts. Finally, those terms were clustered by the group average method using these cosine measures. After obtaining the clusters, we manually annotated the names of the clusters.
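As a rough illustration of the similarity and clustering steps, the sketch below computes cosine similarity between term vectors and merges terms by group-average linkage. The term counts are hypothetical toy data, the function names are ours, and the stopping threshold is an assumption made only to end the merging; the paper itself reports a full dendrogram rather than a cut-off.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine of the angle between two term vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def group_average_clusters(term_vectors, threshold):
    """Agglomerative clustering with group-average linkage.

    `term_vectors` maps each term to its row of the term-document matrix.
    Merging stops when the best average similarity falls below `threshold`
    (an assumed stopping rule for this sketch).
    """
    clusters = [[term] for term in term_vectors]

    def average_similarity(c1, c2):
        # Mean cosine similarity over all cross-cluster term pairs.
        total = sum(cosine(term_vectors[a], term_vectors[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), average_similarity(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical counts: each list holds a term's occurrences in four abstracts.
toy_counts = {
    "forest": [2, 1, 0, 0],
    "timber": [1, 1, 0, 0],
    "fish":   [0, 0, 3, 1],
    "catch":  [0, 0, 1, 1],
}
toy_clusters = group_average_clusters(toy_counts, threshold=0.5)
```

In this toy example, terms that occur in the same abstracts end up in the same cluster, which is the behavior the paragraph above describes. The resulting groups would then be named by hand.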
Results and discussion
Characteristics of the top 15 clusters in the citation network
Agriculture, Ecosystems & Environment
Journal of Sustainable Agriculture
Natural capital accounting
Field Crops Research
Nutrient Cycling in Agroecosystems
Forestry (tropical rain forest)
Forest Ecology and Management
Timber and non-timber forest
Strategic Management Journal
Sustainable competitive advantage
Journal of Business Ethics
Academy of Management Review
Ocean & Coastal Management
Annals of Tourism Research
Water Science and Technology
Hydrological Sciences Journal
Journal of Forestry
Canadian Journal of Forest Research
Landscape and Urban Planning
Journal of Planning Education and Research
American Journal of Alternative Agriculture
International Journal of Hydrogen Energy
Health Policy and Planning
Social Science & Medicine
Tropical Medicine & International Health
Australian Journal of Soil Research
Indian Journal of Agronomy
Organic matter management
Grass and Forage Science
Geography in Higher Education
Biodiversity and Conservation
Cluster #5 is Forestry (tropical rain forest). Most papers in this cluster are written by authors in the US and discuss management and economic aspects of timber and non-timber forest products from tropical forests. Cluster #6 is the Business cluster, which is somewhat noisy because most papers discuss the sustainable competitive advantages of a firm. The topological position of the cluster in the citation network reflects this. Some papers definitely share the same context as the other categories, however, e.g. by linking environmental performance and economic performance. Cluster #7 is the Tourism cluster; the subject of sustainable tourism is controversial and the management of oceans and coasts in particular is deliberated. Cluster #8 is the Water cluster, in which wastewater treatment, water resource management, and the water cycle are key topics. It is noteworthy that China focuses on water research. Cluster #9 is the Forestry (biodiversity) cluster; Canada has the highest CWF and is predominant in this cluster. An important goal for research in the cluster is the conservation of biological diversity in forests.
Cluster #10 is the Urban Planning cluster, in which sustainable city and landscape planning are key topics. Social and political aspects of sustainability, for example planning and regulation, are also discussed. Cluster #11 is the Rural Sociology cluster, in which sustainability is closely associated with social issues. Key topics are agreement between the countries of the North and those of the South, rural development, local knowledge, and local food systems. Cluster #12 is the Energy cluster, which is the youngest among the top 15 clusters. In the Energy cluster no country has a value of CWF markedly higher than for other countries, which means that the sustainability of energy is a common and global problem, at least for the developed countries where scientific research is active. Cluster #13 is the Health cluster, in which the sustainability of health projects is discussed. The penetration of intervention into a population and community participation in health-care programs is essential for sustaining health. Cluster #14 is the Soil cluster. Compared with the Agriculture cluster the Soil cluster is more technology-focused. In journals with a high JWF, however, detection of this cluster may be because of an emphasis on regional agricultural systems or a citation bias by which researchers in each country cite journals of their own countries. Cluster #15 is the Wildlife cluster, in which the impact of commercial hunting on forest mammals is investigated. Subsistence hunting by inhabitants and the sustainability of wildlife, especially mammals threatened by game hunting, are investigated.
In the citation-based approach it is assumed that citing and cited papers have similar research topics. Citation behavior is motivated in different ways, however (MacRoberts and MacRoberts 1989), and the results therefore reflect the cognitive structure of scholars in each research domain (Kajikawa et al. 2006). In other words, the citation map must be interpreted with these different motivations in mind, for example citing papers with similar research topics, citing unrelated but prominent papers, and self-citation. We therefore used NLP as a supplemental method for citation network analysis. We shall now look at the results obtained by NLP.
Clusters extracted by natural language processing
Example of extracted terms
Education, training, learning, skill, school, university, innovation
Biotechnology, cell, protein, gene, cultivar, breeding, pesticide
Hospital, patient, care, disease, vaccine, infection, pathogen, insect, insecticide
Livestock, rangeland, grassland, pasture, forage, cattle, sheep
Water, river, groundwater, aquifer, wastewater, effluent, drainage
Ecology, economics, regulation, legislation, profitability, tourism
Climate change, biosphere, planet, pollution, CO2, temperature, emission
Energy, fuel, electricity, oil, hydrogen, biomass, vehicle, recycling
Forestry, tree, timber, planting, fire, vegetation, logging, plantation
Fishery, fish, fishing, ocean, sea, aquaculture, catch, harvest
Crop, rice, corn, plant, soil, fertility, nutrient, cultivation, erosion, topsoil
Business, company, firm, customer, competitiveness, capability
Welfare, well-being, safety, health, food, nutrition, diet, consumer
Livelihood, income, household, poverty, family, employment, consumption
Capital, market, investment, price, benefit, cost, labor, incentive, value
Biodiversity, wildlife, hunting, preservation, bird, landscape, ecosystem
Sustainability, society, nature, future, goal, assessment, solution, system
Interview, questionnaire, simulation, scenario, survey, database, history
Access, dynamics, program, question, debate, trend, structure, contribution
Comparing the results obtained by NLP (Table 3) with those by citation network analysis (Table 2), we can see similar clusters. Natural resource-related clusters such as Agriculture, Fisheries, Forestry, Water, and Biodiversity are extracted by both citation network analysis and NLP. These clusters are the central research domains of sustainability science. Clusters relating to Economics (Ecological Economics and Business) are also seen in both results. But some discrepancies exist. For example, the Tourism cluster in the citation network seems to be merged into the Ecological Economics cluster (cluster D) in NLP. This is because in the Tourism cluster the focus of discussion is often on its economic aspects. In NLP we have only one Forestry cluster (cluster F) whereas in the citation network there are three Forestry-related clusters (#4, #5, #9). This suggests the existence of a common terminology for these forestry research domains.
In addition to common clusters, some new clusters can be detected by NLP. These are clusters A1–A3 (Education, Biotechnology, Medical), B (Livestock), E1 (Climate Change), I2 (Welfare), and I3 (Livelihood). Clusters A1–A3 are closely related to each other as shown in the dendrogram (Fig. 7). These clusters are associated with the Health cluster (#13) in the citation network, which implies that education and biotechnology are mainly discussed in the context of the sustainability of health programs. Cluster E1 (Climate Change) is close to cluster E2 (Energy) in the dendrogram. Climate change cannot be detected as a distinct cluster in the citation network but appears in the dendrogram by NLP at this position. Clusters I2 (Welfare) and I3 (Livelihood) are also detected as distinct clusters.
Why do these clusters emerge? One explanation is that these clusters have terms that appear in most of the clusters in the citation network but with few appearances in each cluster. As terms such as education and welfare appear in each citation cluster in small quantities, we cannot detect them as independent clusters by citation network analysis. Nevertheless, distinct clusters are shown by NLP because these terms appear in large quantities across the entire corpus. These clusters that were originally extracted by NLP are therefore considered to be common terms for clusters in the citation network. Common clusters seem to be formed around topics representing what we should sustain: agriculture, fish, water, forests, energy, biodiversity. Some of the clusters that originally appear in the citation network are sub-categories such as soil and wildlife. Clusters originally detected by NLP are more common and more human-rooted, for example welfare, livelihood, and education.
Finally, let us address the limitations of our research. In our approach, we collected the corpus by making a query. The results obtained by citation network analysis indicated that agriculture and fisheries occupy the largest fractions of sustainability science. On the other hand, energy, which is an unquestionably important area of research in sustainability, represents a relatively small fraction of research and is the youngest among the top 15 clusters. But we must note that usage among researchers of the term “sustainability” has been changing. Sustainability was used as a technical term in the early days but nowadays seems to be used to express the importance of global sustainability. It is plausible that clusters with a longer history (e.g. agriculture) have used “sustainability” as a technical term while the younger Energy cluster uses “sustainability” with the latter meaning. Therefore, changes in the definition of sustainability (or the usage of this word) may be behind these results. Debate on the definition and targets of sustainability will continue as a part of sustainability science.
Although sustainability is an important concept for society, economics, and the environment, its definition is unclear. The number of journals and papers on sustainability continues, nevertheless, to increase. For example, there are several journals on sustainability specializing in sub-domains of sustainability, for example agriculture, forestry, tourism, energy, and education. Over 3,000 papers on sustainability are currently published annually. Sustainability science is expected to integrate these sub-domains and to offer forums for discussion addressing the polyphonic and polysemic nature of sustainability.
This paper analyzed the current status of sustainability science and used a computer-based approach to provide a fundamental framework for future research. In this paper we visualized the structure of sustainability science by analysis of citations in relevant publications, and used a topological clustering method to detect the sub-domains of sustainability science.
Our citation analysis extracted 15 main research domains: Agriculture, Fisheries, Ecological Economics, Forestry (agroforestry), Forestry (tropical rain forest), Business, Tourism, Water, Forestry (biodiversity), Urban Planning, Rural Sociology, Energy, Health, Soil, and Wildlife. Agriculture, Fisheries, Ecological Economics, and Forestry (agroforestry) clusters are predominant among these. The Energy cluster is currently developing. These results were compared with those obtained by natural language processing. Education, Biotechnology, Medical, Livestock, Climate Change, Welfare, and Livelihood clusters were uniquely extracted by natural language processing, because they are common topics across other sub-domains of sustainability science.
We hope that the journal Sustainability Science publishes updates on achievements in each domain and facilitates interdisciplinary quests, multidisciplinary efforts to integrate these, and transdisciplinary actions to change the real world. We also hope that our landscape serves to guide those who contribute to sustainability science and helps them move society in sustainable directions, based on a clear grasp of their current position and new directions to explore.
We are grateful to anonymous referees for their valuable comments, owing to which our manuscript was greatly improved. We thank Associate Professor Hideki Mima for his help in natural language processing. We also thank Dr Ai Hiramatsu, Professor Yoshihisa Murasawa, and Professor Shuichiro Asao at The University of Tokyo for encouraging our research. This research was partially supported by the Ministry of Education, Science, Sports and Culture, Grant-in-Aid for Young Scientists (B), 18700240, 2006.