Introduction

Haze, which is produced by smoke, fog, dust, and other tiny airborne particles, often occurs in large cities (Kim Oanh and Leelasakultum 2011; Yang et al. 2015). It is mainly composed of PM10 and PM2.5. Haze weather not only reduces visibility and increases the frequency of traffic accidents but also degrades air quality and induces respiratory and cardiovascular diseases (Hand et al. 2014; Zhang et al. 2015c; Fu and Chen 2017). Haze can also affect the earth’s climate by altering its radiation budget (Davies and Unam 1999; Bytnerowicz et al. 2003; Tonnesen et al. 2003). The haze problem has therefore attracted increasing attention.

Some scholars have analyzed solutions for controlling haze weather (Gao 2008; Voiland 2010; Wang and Zheng 2013). Fu and Chen (2017) reviewed the factors contributing to haze formation and proposed future directions for addressing haze pollution in China. Kulmala (2015) argued that air pollution control remains a great challenge because urban air is a complex cocktail of chemicals whose poorly understood interactions and feedbacks may exacerbate health problems. Many researchers have examined the components of PM2.5 in terms of their chemical and physical properties (Bates and Sizto 1987; Thurston et al. 1994; Ma et al. 2012; Jansen et al. 2014; Sun et al. 2015; Zhang et al. 2015b; Wu et al. 2017). Many published papers have also characterized the haze problem from the human health perspective (Davis et al. 2002; Tie et al. 2009; Liu et al. 2015; Ren et al. 2016).

As a statistical and visual approach to published papers, bibliometrics provides a way to analyze academic documents quantitatively (Mayr and Scharnhorst 2014; Chen et al. 2016). Many studies have evaluated the research relationships of authors, institutes, countries, etc. in specific research fields (Wang et al. 2010; Abramo et al. 2011; Gupta and Bala 2012; Matthews 2013; Bajwa and Yaldram 2013; Li and Zhao 2015). In recent years, a great number of papers have been published on haze and related fields: 5606 documents on haze appeared in the Science Citation Index Expanded (SCI-Expanded) and Social Science Citation Index (SSCI) of the ISI-Thomson Reuters Scientific database from 2000 to 2016. Although much attention has been paid to the haze problem, few papers have attempted to analyze and examine global academic publication data visually. The present study therefore aims to reveal research patterns in author distribution, international collaboration, and academic relationships in haze research.

Methodology and data collection

Methodology

Bibliometric methods provide an approach to identifying development trends or future research orientations by analyzing publication output, keywords, authors, institutes, and countries (Li et al. 2015; Chen et al. 2016). The statistical results on the distribution of authors, institutes, countries/territories, and keywords can be shown visually by using bibliometric analysis tools including VOSviewer, CiteSpace, and HistCite.

CiteSpace is a scientific visualization tool used for visualizing and mapping statistical publication data from the ISI-Thomson Reuters Scientific database. It is a freely available Java application for visualizing and analyzing trends and patterns in scientific literature, and it focuses on finding pivotal points in the evolution of a research field. CiteSpace provides various functions to facilitate the understanding and interpretation of network patterns: it can identify fast-growing topical areas; find citation hotspots in a collection of publications; decompose a network into clusters; automatically label clusters with terms from citing articles; and reveal geospatial patterns of collaboration and unique areas of international collaboration (Chen 2014). CiteSpace not only supports structural and temporal analyses of a variety of networks derived from scientific publications, including collaboration networks, author co-citation networks, and document co-citation networks, but also supports networks of hybrid node types such as terms, institutes, and countries, and hybrid link types such as co-citation, co-occurrence, and directed citing links (Chen 2004). The primary source of input data for CiteSpace is the Web of Science, whose exported records CiteSpace processes directly. In addition, CiteSpace can generate geographic map overlays, viewable in Google Earth, based on the locations of authors (Chen 2006).

Dataset for visualization analysis

The data for the present study were collected in March 2016 from Web of Science (http://webofknowledge.com). In particular, records from the Science Citation Index Expanded (SCI-Expanded, 2000–2016) and the Social Science Citation Index (SSCI, 2000–2016) were collected from the online database published by Thomson Reuters. The data retrieval strategy was set as follows:

  • Topic = “Haze”; that is, articles with this word in the title, abstract, or keywords were retrieved.

  • Timespan = 2000–2016.

  • A total of 5606 papers were collected in this study.
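As a minimal sketch of how records retrieved this way could be inspected outside CiteSpace, the following Python reads a Web of Science tagged plain-text export and keeps records whose publication year falls in the timespan. It assumes the usual two-letter field tags of that export format (AU for authors, PY for publication year, ER for end of record); the sample records and the function names are hypothetical, and this is not a script used in the present study.

```python
from collections import defaultdict

def parse_wos_records(text):
    """Parse WoS tagged plain text into a list of dicts (tag -> list of values)."""
    records = []
    current = defaultdict(list)
    tag = None
    for line in text.splitlines():
        if not line.strip():
            continue
        head, body = line[:2], line[3:].strip()
        if head in ("FN", "VR"):      # file header lines, not part of a record
            continue
        if head == "EF":              # end of file
            break
        if head == "ER":              # end of record
            records.append(dict(current))
            current = defaultdict(list)
            tag = None
        elif head.strip():            # a new field tag starts here
            tag = head
            current[tag].append(body)
        elif tag is not None:         # indented continuation of the previous tag
            current[tag].append(body)
    return records

def in_timespan(record, first=2000, last=2016):
    """True if the record's publication year (PY) lies in [first, last]."""
    years = record.get("PY", [])
    return bool(years) and first <= int(years[0]) <= last

# Hypothetical sample, for illustration only.
SAMPLE = """PT J
AU Doe, J
   Roe, R
TI A hypothetical haze study
PY 2014
ER

PT J
AU Poe, E
TI An older hypothetical record
PY 1998
ER
"""

records = parse_wos_records(SAMPLE)
selected = [r for r in records if in_timespan(r)]
print(len(records), len(selected))
```

Keeping every field as a list of values makes multi-valued fields such as AU straightforward to count later.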

Parameter design

Time Slicing was set from 2000 to 2016, and Years Per Slice was set to 1. Term Source was set to “Title,” “Abstract,” “Author Keywords (DE),” and “Keywords Plus (ID).” Term Type was set to “Burst Terms.” Node Types were set to “Author,” “Institution,” “Country,” and “Keyword,” respectively. The size of each circle represents the number of publications, and the distance between two circles is inversely related to the amount of collaboration between the corresponding authors, countries/territories, or institutes. Concretely, the shorter the distance between two circles, the more collaboration there is between the two.
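The effect of the Time Slicing and Years Per Slice settings can be illustrated with a short sketch. The function below is a hypothetical helper, not CiteSpace’s internal code; it only shows that slicing 2000 to 2016 with one year per slice yields 17 slices, one per publication year.

```python
def time_slices(first_year, last_year, years_per_slice=1):
    """Split [first_year, last_year] into consecutive (start, end) slices."""
    return [
        (start, min(start + years_per_slice - 1, last_year))
        for start in range(first_year, last_year + 1, years_per_slice)
    ]

slices = time_slices(2000, 2016, years_per_slice=1)
print(len(slices), slices[0], slices[-1])
```

A larger slice width would pool several years into one snapshot, with the last slice shortened if the span does not divide evenly.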

The overall methodology is shown in Fig. 1.

Fig. 1 Research methodology

Results and discussion

Publication year

From 2000 to 2016, 5606 documents on haze were published in the ISI-Thomson Reuters Scientific database. The annual output rose from 153 documents in 2000 to 779 in 2016. Yearly research outputs are shown in Fig. 2. The results reveal that haze remained a nearly constant focus of scholars over these 17 years.

Fig. 2 Annual publications related to haze in the WOS core collection, published from 2000 to 2016
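To put the two endpoint counts in perspective, a compound annual growth rate can be computed from them. The sketch below uses only the figures reported above (153 documents in 2000 and 779 in 2016); the function name is illustrative, and the calculation assumes smooth growth between the endpoints, which the yearly data need not follow.

```python
def compound_annual_growth_rate(first_count, last_count, n_intervals):
    """Average yearly growth rate implied by two endpoint counts."""
    return (last_count / first_count) ** (1 / n_intervals) - 1

fold_increase = 779 / 153
rate = compound_annual_growth_rate(153, 779, 2016 - 2000)
print(f"fold increase: {fold_increase:.2f}, average yearly growth: {rate:.1%}")
```

Under these assumptions the annual output grew about fivefold, which corresponds to an average increase of roughly 10.7% per year.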

Most areas in China have forecast haze weather as a kind of severe weather warning since 2011 (Zhang et al. 2015d; Gao et al. 2017). Owing to high levels of atmospheric pollutant emissions, more serious haze episodes occurred in China after 2013, especially in urban agglomerations such as the Beijing-Tianjin-Hebei region, the Yangtze River Delta area, and the Pearl River Delta area (Fu and Chen 2017; Li et al. 2017). In response to the extremely serious haze pollution, the Chinese State Council decided to reduce and control concentrations of PM2.5 (Wang et al. 2014; Zhang et al. 2015a). To achieve this goal, the Chinese government proposed 10 prevention measures for aerosol pollution control, called Atmospheric Pollution Prevention and Control of the Ten Measures of China (http://www.gov.cn/gzdt/2013-09/16/content_2489162.htm). These measures have received continuous attention from scholars seeking to reduce aerosol-related emissions, with an emphasis on fossil fuel combustion, vehicle exhaust, and industrial waste gas (Guo et al. 2014; Zhang et al. 2015c). Together, these developments may explain why publications related to haze began to grow rapidly from 2013.

Authorship

The academic cooperative connections among authors generating research on haze are shown in Fig. 3. The authors tended to cooperate with small groups of collaborators and thus formed several clusters. The top 30 most productive authors by total publications are shown in Table 1. The major academic contributions, in terms of total publication frequency, primarily originated from Li J., Li L., Zhang Y., and Wang Y. As to the publication distribution of the top 30 most productive authors, 46 publications are from Li J., followed by 29 from Li L., 29 from Zhang Y., 29 from Wang Y., 27 from Zhang Q., 26 from Cao J.J., 25 from Wang Z.F., 25 from McKay C.P., 22 from Rannou P., 21 from Wang Y.S., 21 from Wilson S.E., 21 from Chen J.M., 21 from Zhao Y., 21 from Wang H., 21 from Irwin P.G.J., 20 from Wang X.M., 20 from Balasubramanian R., 19 from Fortney J.J., 18 from Malm W.C., 18 from Chen Y., 18 from Zhang X.Y., 18 from Sun Y.L., 18 from He K.B., 17 from Coustenis A., 17 from Alio J.L., 17 from Baines K.H., 16 from Teanby N.A., 15 from Sotin C., 14 from Che H.Z., and 14 from Waters E.J.

Fig. 3 Author co-citation map from 2000 to 2016

Table 1 The top 30 most productive authors
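The quantities behind an author network of this kind can be sketched as follows: node size corresponds to the number of papers per author, and link strength to the number of papers two authors share. This is an illustrative reconstruction that assumes collaboration is measured by co-authorship; the author lists are hypothetical placeholders, and the code is not CiteSpace’s own procedure.

```python
from collections import Counter
from itertools import combinations

def coauthor_network(author_lists):
    """Return (papers per author, shared papers per author pair)."""
    papers = Counter()
    links = Counter()
    for authors in author_lists:
        unique = sorted(set(authors))          # ignore duplicates within a paper
        papers.update(unique)                  # node size: one count per paper
        links.update(combinations(unique, 2))  # link weight: one count per shared paper
    return papers, links

# Hypothetical author lists, one per paper.
PAPERS = [
    ["Author A", "Author B", "Author C"],
    ["Author A", "Author B"],
    ["Author D"],
]

papers, links = coauthor_network(PAPERS)
print(papers.most_common(2))
print(links.most_common(1))
```

Sorting the names before pairing ensures that each pair is stored under a single key regardless of author order on the paper.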

Countries/territories

To map the distribution of publications on haze, we used CiteSpace to obtain a network based on the authors’ countries/territories. The resulting network includes nodes and links representing the collaborations among countries/territories. Once the countries/territories co-citation results are obtained, a geographical map can be generated with the Generate Google Earth Maps (KML 2.0) function in CiteSpace.

The academic cooperative connections among countries/territories generating research on haze are shown in Fig. 4a. The top 30 most productive countries/territories by total publications are shown in Table 2. The major academic contributions, in terms of total publication frequency, primarily originated from the USA, China, Germany, and France. As to the publication distribution of the top 30 most productive countries/territories, 1925 are from the USA, followed by 1162 from China, 432 from Germany, 425 from France, 323 from England, 297 from South Korea, 268 from Italy, 259 from Japan, 223 from Canada, 197 from Spain, 179 from India, 162 from Australia, 158 from Taiwan, 114 from the Netherlands, 108 from Switzerland, 95 from Singapore, 83 from Brazil, 68 from Turkey, 67 from Sweden, 63 from Malaysia, 62 from Norway, 61 from Russia, 59 from Belgium, 52 from Finland, 52 from Denmark, 49 from Greece, 45 from Israel, 43 from Austria, 43 from Portugal, and 38 from Poland. Most articles on haze were published by authors from these countries.

Fig. 4 (a) Country/territory co-citation map from 2000 to 2016. (b) Geographical map of country/territory co-citation from 2000 to 2016

Table 2 The top 30 most productive countries/territories
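The country counts can also be expressed as shares of the 5606 collected documents. The sketch below does this for the four leading countries using the counts reported above. Because the listed counts add up to more than 5606, a paper with authors from several countries is presumably counted once for each of them; this is an assumption about the counting scheme, so the shares should not be expected to sum to 100%.

```python
TOTAL_DOCUMENTS = 5606
COUNTRY_COUNTS = {"USA": 1925, "China": 1162, "Germany": 432, "France": 425}

def publication_shares(counts, total):
    """Share of all documents with at least one author from each country."""
    return {name: n / total for name, n in counts.items()}

shares = publication_shares(COUNTRY_COUNTS, TOTAL_DOCUMENTS)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

On these figures, the USA and China appear on about 34% and 21% of the documents, respectively.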

We generated the geographical map of the authors’ countries/territories from the countries/territories co-citation results through Generate Google Earth Maps (KML 2.0) (Fig. 4b). The figure shows that more countries/territories in the northern hemisphere than in the southern hemisphere participated in haze research. On the one hand, academic cooperative connections among countries/territories in the northern hemisphere were relatively concentrated; on the other hand, the fact that some countries/territories, including China, India, Russia, South Korea, and Belgium (Yang et al. 2015), have faced severe air pollution problems in recent years indirectly suggests that air quality in the northern hemisphere was worse than that in the southern hemisphere. As a result, scholars in the northern hemisphere paid close attention to air pollution problems, including the haze phenomenon.

Institutions

The academic cooperative connections among institutes generating research on haze are shown in Fig. 5a. The top 30 most productive institutes by total publications are shown in Table 3. The major academic contributions, in terms of total publication frequency, primarily originated from the Chinese Acad Sci, NASA, CALTECH, and Univ Arizona. As to the publication distribution of the top 30 most productive institutes, 347 are from the Chinese Acad Sci, followed by 233 from NASA (National Aeronautics and Space Administration), 154 from CALTECH (California Institute of Technology), 122 from Univ Arizona, 96 from Univ Maryland, 95 from Peking Univ, 76 from Univ Chinese Acad Sci, 74 from Nanjing Univ Informat Sci & Technol, 69 from Univ Colorado, 69 from Tsinghua Univ, 66 from Univ Paris 06, 66 from Fudan Univ, 59 from Univ Calif Berkeley, 55 from Natl Univ Singapore, 54 from Chinese Acad Meteorol Sci, 54 from Observ Paris, 53 from Cornell Univ, 50 from Univ Oxford, 50 from China Meteorol Adm, 49 from Johns Hopkins Univ, 49 from Beijing Normal Univ, 43 from Univ Calif Santa Cruz, 40 from NOAA (National Oceanic and Atmospheric Administration), 39 from Chinese Res Inst Environm Sci, 37 from Colorado State Univ, 35 from Univ Wisconsin, 32 from Harvard Univ, 31 from CSIC (Spanish National Research Council), 30 from Nanjing Univ, and 30 from CNRS (Centre National De La Recherche Scientifique).

Fig. 5 (a) Institute co-citation map from 2000 to 2016. (b) Time zone view of the keywords co-citation map from 2000 to 2016

Table 3 The top 30 most productive institutes

Among the top 30 institutes, 14 are in the USA, 11 are in China, three are in France, and one each is in Singapore and Spain. By number of publications, NASA ranks first in the USA and the Chinese Acad Sci ranks first in China. The results show that higher education institutes form a remarkable backbone of scientific research (Table 3).

Keywords

The keywords of an article help us understand how a research topic develops (Chen et al. 2015). Based on the annual snapshots, a developmental time zone view of haze research is shown in Fig. 5b. Each keyword node is represented as tree rings, and the rings and links are shown in a spectrum of colors corresponding to the years of the keywords’ appearance (Chen 2014). The major focuses of haze research evolved from 2000 to 2016. For example, in 2001 scholars emphasized haze research involving photorefractive keratectomy, in situ keratomileusis, and excimer laser, whereas in 2006 published studies mainly focused on haze formation, air pollution, and chemical composition. In addition, no new hot topics of research emerged in 2010.

Of all the words shown in Table 4, “haze,” with a frequency of 1088 in the network, together with “aerosol” (474), “atmosphere” (407), and “model” (306), is among the high-frequency keywords. The keywords “optical property” (294), “particle” (203), “emission” (194), “surface” (187), “pollution” (165), “PM2.5” (144), “visibility” (136), “chemistry” (125), “particulate matter” (120), “haze formation” (120), “temperature” (118), “PM25” (105), “air quality” (104), “impact” (100), “climate” (94), “source apportionment” (93), and “stability” (91) represent the contents of haze research, such as the formation, features, and control strategies of haze; “China” (177) represents the study area of haze research; and “model” (306), “photorefractive keratectomy” (290), “in situ keratomileusis” (255), “TITAN (Texas Instruments test analyzer)” (200), “excimer laser” (127), “myopia” (107), and “film” (100) represent the methods of haze research.

Table 4 The top 30 most productive keywords
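The keyword list contains both “PM2.5” (144) and “PM25” (105), which look like spelling variants of one term. The sketch below shows how such variants could be merged before ranking. Treating the two as equivalent is an assumption made here for illustration, as are the function name and the lowercase normalization; the frequencies are a subset of those reported above.

```python
from collections import Counter

def merge_variants(frequencies, canonical):
    """Sum keyword frequencies after mapping spelling variants to one form."""
    merged = Counter()
    for keyword, count in frequencies.items():
        key = keyword.lower()
        merged[canonical.get(key, key)] += count
    return merged

# Subset of the reported keyword frequencies.
KEYWORD_FREQUENCIES = {"haze": 1088, "aerosol": 474, "PM2.5": 144, "PM25": 105}

# Assumed equivalence: "pm25" is treated as a variant of "pm2.5".
CANONICAL = {"pm25": "pm2.5"}

merged = merge_variants(KEYWORD_FREQUENCIES, CANONICAL)
print(merged.most_common(3))
```

If the two spellings are indeed equivalent, the combined frequency is 249, which would place the term above “particle” (203) in the ranking.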

Conclusions

Because it causes serious air quality problems, haze is an important research subject that attracts scholars’ attention all over the world. The last decade has witnessed rapid development in the literature on haze; however, there have been few attempts to map the global research through the bibliometric approach. Understanding the evolution and orientation of haze research has therefore become a pivotal goal for related researchers, countries/territories, and institutes. Based on 5606 documents on haze in the Science Citation Index Expanded (SCI-Expanded) and Social Science Citation Index (SSCI) of the ISI-Thomson Reuters Scientific database, research network patterns and hotspots in haze research were generated for 2000 to 2016. Judging from the present status, research on haze will continue to grow rapidly. According to the data from the SCI-Expanded and SSCI databases, the six most productive authors (Li J. with 46 articles, Li L. with 29, Zhang Y. with 29, Wang Y. with 29, Zhang Q. with 27, and Cao J.J. with 26), as well as other scholars in this domain, have made great contributions to haze research. The publications on haze research primarily originated from the USA, China, Germany, and France. The dominant hot spots of haze research from 2000 to 2016 can be summarized as “aerosol,” “atmosphere,” “particle,” “PM2.5,” and “air quality,” and these will remain key issues in haze research in the future. These findings provide a foundation for researchers in the field to understand the development process and trends of haze research.

Yang et al. (2015) first examined the publication share, growth rate, and top journals of research on haze by using a scientometric approach. However, the method of that paper is limited to statistical analysis, and the study area is confined to China. A historical and detailed account of the evolution of haze research worldwide is therefore still lacking.

This is the first comprehensive quantitative and qualitative bibliometric analysis of scientific documents in the field of haze research. The research findings on the distribution of authors, institutes, countries/territories, and keywords were visualized by using CiteSpace to identify development trends and future research orientations. In addition, to demonstrate the combination of bibliometric analysis and geographical analysis, the geographical map of the authors’ countries/territories was generated from the bibliometric results through Generate Google Earth Maps.

Bibliometric tools, including CiteSpace, Netdraw, VOSviewer, and HistCite, provide a comprehensive approach to identify the development trends or future research orientations by analyzing the publication output, keywords, authors, institutes, and countries/territories. Those analysis results related to one research field can not only provide references for scholars, but also help policy makers to recognize and evaluate the advanced international research organizations.

Based on the above analysis and discussion, future studies should focus on the following aspects: (1) using various bibliometric tools to compare the current bibliometric approaches; and (2) exploring a suitable way to fully combine bibliometric analysis with geographical analysis.