In what follows, we first report some basic statistics on the sample record and the results of the clustering procedure based on bibliographic coupling. Then, we present the results from the analysis of individual clusters in detail.
Basic statistics of the bibliographic record and of the clustering results
The search strategy applied in the compilation of our database did not constrain the basic bibliographic characteristics of the sample (such as document type, publication venue or publication year). The distribution of papers over time is shown in Fig. 3. The growth in the volume of publications in this domain of research has accelerated since the 2000s and has followed an exponential-like dynamics since the turn of the millennium. The distribution of sample papers among publication venues signals the prevalence of some journals, which already foreshadows the prevalence of certain themes within the domain. Table 1 ranks the journals that contribute most frequently to the discourse, based on the number of papers (only journals represented by at least 5 papers are shown). Clearly, the most dominant venues are concerned with health communication, closely followed by prominent LIS journals and the most prestigious outlets on general science communication. The list also supports the indirect strategy employed for database construction, whose objective was to navigate the connections between Library and Information Science and communication studies on science–society relationships.
The clustering procedure based on bibliographic coupling organized the database into five clusters (after pruning the cluster structure to exclude some extremely small and incoherent satellite sets with fewer than 10 papers). The size distribution of clusters is set out in Table 2. The distribution is reasonably balanced, with 120 papers per cluster on average. The detailed analysis of clusters is provided in the next sections.
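As a rough illustration of the two ideas named in this paragraph, the sketch below computes bibliographic coupling strength as the number of cited references shared by two papers, and then drops clusters with fewer than 10 papers. The toy reference lists, the toy cluster assignment, and the function names are hypothetical; the clustering step applied to the coupling network is not reproduced here.

```python
from itertools import combinations


def coupling_strength(refs_by_paper):
    """Return {(paper_a, paper_b): number of shared cited references}.

    Only pairs that share at least one reference are kept.
    """
    strengths = {}
    for a, b in combinations(sorted(refs_by_paper), 2):
        shared = len(set(refs_by_paper[a]) & set(refs_by_paper[b]))
        if shared > 0:
            strengths[(a, b)] = shared
    return strengths


def prune_small_clusters(clusters, min_size=10):
    """Keep only clusters that contain at least `min_size` papers."""
    return {cid: papers for cid, papers in clusters.items()
            if len(papers) >= min_size}


# Hypothetical toy data: three papers and their cited references.
refs = {
    "p1": ["r1", "r2", "r3"],
    "p2": ["r2", "r3", "r4"],
    "p3": ["r5"],
}
strengths = coupling_strength(refs)

# Hypothetical cluster assignment: one regular cluster, one tiny satellite set.
clusters = {
    "A": [f"a{i}" for i in range(12)],
    "B": ["b1", "b2", "b3"],
}
kept = prune_small_clusters(clusters)
```

In this toy case, only the pair p1 and p2 is coupled (two shared references), and the three-paper satellite set B falls below the size threshold and is removed.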
The analysis of individual clusters
In what follows, each cluster is subjected to a detailed analysis, applying a systematic and comparative interpretation of the network view and the tree view of the conceptual and thematic organization of the cluster. For the sake of commensurability, the results are framed using the domain-relevant categories of the quasi content analysis introduced in the Methods section. Reading the two structural views (network vs. tree) requires some further explanation, as we enhanced both visualizations to increase their expressive potential. In the tree view, color-coded terminal nodes signal the relative frequency of concepts/terms within the cluster. The four colors (red, orange, yellow, blue) indicate the quartile of the frequency distribution (Q1, Q2, Q3, Q4) to which the term belongs. Hence, “red concepts” are the dominant ones within the cluster in terms of frequency, and so forth. As noted in the Methods section, labels of the same color on the tree form a community, or coherent term group, within the semantic network, according to the outcomes of the Louvain method. In the network view, the size of nodes is proportional to their betweenness centrality, so that the most central concepts stand out. To facilitate interpretation, only the most central concepts are labelled (B > 0.1), while the tree view offers a more granular picture, showing the detailed environment of the most central concepts together with terms that are still highly central (B > 0.05).
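The reading conventions above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to produce the figures: the term names, frequencies and centrality values are hypothetical, quartiles are assigned by rank (ties broken alphabetically), and betweenness centrality B is taken as already computed and normalized.

```python
# Q1 (most frequent) to Q4 (least frequent), as in the tree view.
QUARTILE_COLORS = ["red", "orange", "yellow", "blue"]


def quartile_colors(freq):
    """Map each term to the color of its frequency quartile.

    Terms are ranked by descending frequency; the top quarter is Q1 (red).
    """
    ranked = sorted(freq, key=lambda term: (-freq[term], term))
    n = len(ranked)
    return {term: QUARTILE_COLORS[(rank * 4) // n]
            for rank, term in enumerate(ranked)}


def terms_above(centrality, threshold):
    """Return the terms whose betweenness centrality exceeds `threshold`."""
    return sorted(t for t, b in centrality.items() if b > threshold)


# Hypothetical toy values for four terms in one cluster.
freq = {"term_a": 40, "term_b": 25, "term_c": 12, "term_d": 3}
centrality = {"term_a": 0.22, "term_b": 0.07, "term_c": 0.15, "term_d": 0.01}

colors = quartile_colors(freq)
network_labels = terms_above(centrality, 0.1)   # labelled in the network view
tree_terms = terms_above(centrality, 0.05)      # shown in the tree view
```

With these toy values, term_a is a "red concept", the network view labels term_a and term_c, and the tree view additionally shows term_b.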
The first recognized cluster exhibits a rather clear conceptual structure (Fig. 4). Of the two main branches of the tree, one includes risk perception, risk communication and health communication as the most frequent concepts, which, together with trust, form a coherent subgroup within the cluster. Cancer information overload, on the other hand, is the dominant concept of the other main branch (both in terms of frequency and centrality). It connects many cancer-related issues to this discourse, including cancer worry and cancer fatalism, loosely paired with diet as a health factor. This pattern recurs, as the other subgroup of this branch also connects the topic of cancer and fatalism with another attitude detected in the public, namely nutritional backlash. The semantic network makes it clear that these two branches are joined through the notion of uncertainty. Regarding the dimensions of the study on science–society communication, the discourse focuses on two aspects, namely public perception and public attitudes, and features one main theme: the uncertainties of cancer-related public knowledge (together with parallel, nutrition-related ones). Although methods are not salient, a central role can be attributed to the concept of measurement.
The second cluster we identified shows a much denser conceptual structure (Fig. 5). The tree is built up of several main branches. Around the most frequent concepts we find (1) a block focusing on twitter and media bias, loosely associated with framing; and (2) a block in which world wide web and internet are coupled with computer-mediated communication, on the one hand, and with methodology, activism, and social network on the other. (3) A strong concept block incorporates newspaper and blog together with nanotechnology and science communication. The closest branch is (4) information management, which is the most central term in this discourse, together with web data mining (or web mining). It is this aggregate that channels in an extensive block of (5) information science and scientometric methods (see the semantic network visualization). The latter includes citation analysis, co-word analysis, bibliometric approaches and network analysis. The network view reveals a certain circle, as the area designated mainly by science communication is connected to the concept of information science through scientometric terms in one direction, and through the term information management in the other. An equally important conceptual branch, though with somewhat lower frequencies, (6) is dominated by big data, which is linked to automated content analysis and communication. In this block, the concept of framing also appears (implicit framing), paired with that of crisis communication. As to the dimensions of the study on science–society communication, this subdiscourse is clearly characterized by the unifying methodology of informetrics and data science, as well as by the communication channel under study, namely the world wide web and internet-based platforms in general. The aspect of interaction is the dissemination of societally recognized scientific issues (as indicated by the role of bias and framing), with the recurring theme of nanotechnology.
The third cluster formed within the discourse also shows a rich structure (Fig. 6). The most dominant block is formed around (1) health, which is closely associated with news source and agenda building, and also directly linked to science journalism and nanotechnology, each representing a frequent concept in this cluster. The other dominant subgroup (in both centrality and frequency) includes (2) health journalism, content analysis and health information, more loosely coupled with cancer. This subgroup shares its root with a further salient subgroup, designated by (3) framing and renewable energy. The network view reveals that it is the concept of content analysis that, with a high overall centrality, connects the latter two subgroups, which constitute the two main poles of the semantic map. In the “northern” pole, the group of renewable energy is extended by a further one with (4) climate change and media coverage, still belonging to the first quartile of the keyword frequency distribution. In the “southern” pole, the “health and science journalism cluster” is also augmented with a group organized around (5) mass media, containing risk communication and attitude as further central concepts (still occurring with relatively high frequencies). The other wing of this branch is more moderate in frequency, featuring health communication and healthy behavior, along with a specific health issue, namely fetal alcohol spectrum disorder. The network view shows two other subnetworks: (6) news media is surrounded by media industry, television, public health and mental illness, with moderately high frequencies, forming the “north-western” part of the map. A relatively distinct “south-eastern” peninsula emerges around (7) qualitative research, media analysis and print media (see the tree view), which are linked with such particular topics as alcohol policy, harms to other, or stigmatisation.
In sum, this subdiscourse can be described as quite coherent in most dimensions of science–society communication. Regarding the communication channel, it is print and mass media, as well as journalism, that are studied. The communication aspect is the public dissemination of, and public attitudes towards, generally recognized science-related issues. The dual thematic orientation of the cluster is rather overt: the two main themes are health topics and health communication, on the one hand, and outstanding environmental issues, on the other. The cluster is also specific with respect to methods, given the central role of content analysis and the extensive reference to qualitative research designs.
The fourth cluster is relatively small and straightforward, both in terms of content and of structure (Fig. 7). The dominant concept is research impact, as it is the most frequent and the most central in the network, being most closely connected with (1) a group consisting of payback framework, health gain and monetization. Belonging broadly to this group is the focal concept of policy impact, through which societal impact is connected to the conceptual net. Another focal wing of this very subgroup is designated by applied research together with knowledge translation. The other main branch of the semantic network is (2) dedicated mainly to the concept of knowledge exchange and management research (and impact in general). These focal concepts span a complete subnetwork (best seen in the network view), where knowledge exchange is coupled directly with stakeholder engagement, and a further aggregate is formed of knowledge intermediary, evaluation and impact. The network view is highly telling inasmuch as it depicts research impact brokering between the two major branches, centered around (1) the social impacts of scientific research and (2) managing the social dissemination and use of the output of research. Regarding the dimensions of science–society communications, this cluster is not specific in many respects (such as theme and method), but very much so regarding the aspect, which is the social impact of scientific and scholarly activity in its various facets (societal, political, economic etc.). The communication channel tackled here can be formulated as the various ways of mediation between science and society aimed at the utilization of research outputs (policies, knowledge intermediaries, stakeholder engagement etc.).
The fifth cluster is again a dense network of concepts, organized into relatively coherent subnetworks (Fig. 8). The most (globally) central term is science communication itself, for which (1) the narrower context is set up by sentiment analysis, public engagement and twitter. Surrounding this central theme, we find several dominant groups: a massive branch is formed by (2) wikipedia (coupled with encyclopedia), which shares its root in the tree with (3) citation analysis and altmetric, and with a further subgroup set up by outreach and astronomy, on the one hand, and Youtube and video on the other. This collective, which broadly spans the “northern” area of the semantic map, is extended with a frequently recurring and central term, (4) TED talk, to cite only the most frequent representatives of all these subgroups. Next to these, a dominant block is designated by (5) social media. The central “science communication” block has a sibling branch, where (6) open access and scientific communication play a central role, with its sibling group made up of social network, institutional communication, and, though rather low in frequency, science popularization, which, arguably, is the most characteristic concept of this cluster (see below). Last but not least, (7) a salient block is organized around the term webometric, which is a focal concept according to the network view as well.
Regarding the dimensions of science–society communications, the content of the cluster is quite well articulated in many ways. The channels under study are clearly those specific on-line platforms that have become the most utilized in science communication (types such as social media, or the most popular instances such as YouTube and Wikipedia). These are the very channels that, at the same time, are also targeted by “post-bibliometrics” attempts to detect broader research impact (altmetrics), which leads us to the aspects and methods featured in this cluster. As to the aspects, on the one hand, it is science popularization, citizen science (indicated by such central terms as astronomy and public engagement) and open science that are put into focus as mediators of knowledge between science and society; on the other hand, it is the detection of on-line research impact (“social impact”), as clearly conveyed by altmetrics as a frequently recurring term. The methods are best represented by the block focusing on webometric approaches, a picture complemented by such automated content analysis methods as sentiment analysis, applied in a web-based context.