Introduction: mapping the multi-layer structure of science

Recently, historians of science have proposed to model knowledge systems, such as science and scholarship, as multi-layer networks made of three inter-related networks (Lalli et al., 2020; Renn et al., 2016). The first and more abstract network, called the semantic network, includes scientific concepts, theories, research methods, experimental protocols, and other epistemic contents, along with their semantic relations. The structures in the semantic layer form the epistemic or intellectual organization of science and scholarship; they are variously called knowledge domains, research areas, or scientific topics (Börner et al., 2005). These epistemic structures, however, do not exist independently from an underlying social organization. Scientific ideas are embedded in a complex social network made of scientists, research centers, universities, and funding agencies that allow the construction, reproduction, and transmission of epistemic contents (Knorr-Cetina, 2003; Latour, 2003). This collection of relations involving individuals and institutions form the social layer of science. The structures in the semantic network frequently have a counterpart in the social network: for instance, knowledge domains are developed in the context of academic disciplines (Sugimoto & Weingart, 2015), whereas the group of scientists that work on specific topics form a so-called invisible college (Crane, 1972; Zuccala, 2006). From a dynamical point of view, the emergence of new scientific areas is mirrored by corresponding changes in scientists’ social ties within the scientific community (Bettencourt et al., 2008; Lalli et al., 2020). In between the semantic and social networks, the intermediate layer of the semiotic network plays a pivotal role. It encompasses the array of material supports of knowledge that embody specific knowledge elements, including libraries, textbooks, laboratory notes, scientific instruments, and so on. The semiotic layer can be considered as the junction between the abstract layer of epistemic contents and the social layer of researchers (Petrovich, 2019).

In the last fifty years, researchers in disciplines such as scientometrics, bibliometrics, and library and information sciences have developed powerful quantitative techniques to analyze the semiotic layer. These techniques focus on publications and citations, which are the main traces left by scientists and scholars in the communication system of science, i.e., the system of papers, journals, books, and collections that allows the exchange of scientific information (Leydesdorff et al., 2017; Lucio-Arias & Leydesdorff, 2009). The research area of science mapping, which has flourished enormously in the last twenty years at the crossroad between scientometrics, networks science, and information visualization, focuses specifically on the visualization of the various semiotic networks that can be extracted from publications and citations. The key insight behind science mapping is that the structures found in these networks provide important information on both the sociological and epistemological aspects of science (Börner, 2010; Börner et al., 2005; Chen, 2013).Footnote 1 The collaboration between science mapping experts and experts in the analyzed field (the so-called domain experts) allows to produce an interpretation of the science maps in which the visible features of the semiotic layer are associated with elements or structures in the semantic or social layer of science. For instance, the semantic meaning of a cluster of co-cited papers (which is a semiotic structure) can be interpreted by the domain expert as a sub-field or a methodology on the basis of their knowledge of the content of the papers.

From this point of view, the components of the semiotic network that are deemed most relevant for individuating structures in the semantic layer are the textual elements of scientific publications, such as keywords, titles, abstracts, even the full-texts of research articles, and the relation of citation between publications. Terms can be associated with concepts in the semantic layers and frequently co-occurring terms to semantic entities such as research programmes or methodological frameworks (Callon et al., 1991; Thijs, 2019). Recently, text-mining approaches based on machine learning algorithms, such as topic modelling, have also been used to reconstruct latent topics in huge sets of publications (see e.g., Ambrosino et al., 2018). Citations, on the other hand, are particularly suitable for tracing epistemic relationships because they attest a transfer of knowledge from the cited to the citing publication (Hyland, 1999; Leydesdorff & Amsterdamska, 1990; Petrovich, 2018). It has been shown that epistemic structures such as scientific specialties are visible as dense areas in the citation network of science (Waltman & van Eck, 2012) and citations have been variously used to measure the intellectual similarity between publications (see Sect. 2 below).

As to the social layer of science, however, science mapping offers a comparatively less rich toolbox of techniques. In fact, the main technique used to reconstruct social structures in the scientific community has been co-authorship analysis, based on the idea that scientific authors are the key actors in the social network and that coauthoring a publication attests a relation of scientific collaboration between researchers.Footnote 2 The success of co-authorship analysis is also due to the easy availability of authorship data, which can be retrieved from large multi-disciplinary databases, such as Web of Science or Scopus, and usually also from more specialized bibliographic databases (see e.g., Newman, 2001). This type of analysis, however, suffers from some well-known limitations related to the practice of authorship in science and scholarship (Laudel, 2002).

First, it is difficult to reconstruct from the author list the contribution of the different authors, which may vary considerably. In some scientific fields, providing funding for the research is sufficient to grant authorship to the laboratory director, even if she has not contributed to the research process or performed the experiments (Larivière et al., 2016). Consequently, the variety of social relationships existing between authors, e.g., hierarchical relations, may get lost when raw co-authorship data are analyzed. This information can sometimes be restored based on authorship conventions. For instance, some disciplines, especially in bio-medical areas, adopt strict rules on the order of authors for accounting for the different roles. This convention, however, is not widespread in every scientific field. Physicists, for instance, rely strictly on the alphabetical order of authors (Galison, 2003). To reconstruct the different roles of authors, some scientific journals such as PLOS ONE have recently adopted fixed taxonomies (e.g., the CRediT system) that allow each author to specify their contribution to research. Authorship conventions and technical innovations may partly resolve the issue of weighting the role of authors in multi-authored publications and, hence, allow for more fine-grained co-authorship analyses (see e.g., Larivière et al., 2021).

Nonetheless, some academic malpractices such as “gift” or “honorary” authorships may still limit the validity of co-authorship data. Gift authorships are given to influential scientists for purely honorary reasons, artificially increasing their weight and centrality in the co-authorship network (Wislar et al., 2011), but also under the pressure of research evaluation systems that reward the raw output of researchers (Abramo et al., 2019). Conversely, “ghost authorship” occurs when the contributions to scientific articles are not rewarded with authorship. Laboratory technicians, statisticians, and sometimes even graduate students, for instance, are rarely counted among the authors of scientific publications (Fogarty, 2020; Shapin, 1989).

More generally, the use of co-authorship data faces the great variety of ways in which being an author is declined in science and scholarship (Biagioli & Galison, 2003). Fields such as high-energy physics, where gigantic experiments are performed in the context of multi-national collaborations such as CERN, are used to the phenomenon of “hyper-authorship”, with scientific publications counting thousands of authors (Cronin, 2001). The paper that reported the discovery of the Higgs boson, for instance, counted more than five thousand authors (Castelvecchi, 2015). Fields like these present very dense co-authorship networks that almost cover the entire array of researchers in a specialty and hence are not very useful for fine-grained analysis. At the opposite extreme of the scale, the social sciences and, most of all, the humanities still witness the prevalence of the single author, making the co-authorship networks too sparse to be informative (Díaz-Faes & Bordons, 2017). In these fields, the wide network of scholars that collaborate to the production of knowledge may be obscured by the prevalence of single authorship (Paul-Hus, Díaz-Faes, et al., 2017; Paul-Hus, Mongeon, et al., 2017).

In the light of these limitations of co-authorship, scholars in scientometrics and library and information science have started to analyze another element of the academic publication to get a richer picture of the social layer of science and scholarship: the acknowledgments. The acknowledgments, which originate from the covering letters that scientists attached to their articles to thank patrons and benefactors, have become a standard element of scientific articles during the second half of the Twentieth century, especially in the English-speaking world (Salager-Meyer et al., 2009). These texts are frequently a rich source of information about the social context surrounding a scientific publication, as they usually mention the funding agencies that provided financial support to the research, the congresses where previous versions of the article were discussed, and, most importantly, the peers and other persons that contributed in various ways to the publications. Such contributions may include the provision of materials and technical skills, moral support and mentorship, intellectual feedback or commentaries, editorial and linguistic assistance (Cronin, 1995, 2001; Cronin & Overfelt, 1994). In the last forty years, the acknowledgments in various disciplines have been analyzed, including computer science (Giles & Councill, 2004), bio-medicine (Cronin & Franks, 2006; McCain, 1991, 2018; Salager-Meyer et al., 2009), chemistry (Cronin et al., 2004), psychology (Cronin et al., 2003), sociology (Cronin et al., 1993), economics (Berg & Faria, 2008; Brown, 2005; Laband & Tollison, 2000; Rose, 2018), and philosophy (Cronin et al., 2003; Petrovich, forthcoming). Recently, text mining methodologies have been used to analyze large-scale sets of publications, including more than 1 million acknowledgments (Paul-Hus, Díaz-Faes, et al., 2017; Paul-Hus, Mongeon, et al., 2017). These studies show that the acknowledgments shed light on the intricate web of socio-cognitive ties within scientific communities, revealing patterns of “sub-authorship” that can complement classic co-authorship data (Cronin, 2004; Patel, 1973). In this sense, the acknowledgments seem the ideal source of data to map that informal segment of the social layer of research fields that remains invisible to standard co-authorship analysis. Acknowledgments may be particularly useful for fields in the social sciences and humanities, where, as noted above, co-authorship data are excessively sparse (Díaz-Faes & Bordons, 2017).

Drawing on the literature on acknowledgments studies, the aim of this paper is to present a new method for mapping the social structure of scientific and scholarly fields based on the acknowledgments of academic publications.

The rest of the paper is organized as follows. In Sect. 2, the function of the acknowledgments in the scholarly communication system is discussed. The standard normative account from library and information science is presented and then the idea of acknowledgments as positioning signals is introduced. Section 3 presents the formal definition of the acknowledgments-based networks that constitute the core of the method. Next, to assess the viability and interest of this new method, Sect. 4 describes its application to a humanities field, i.e., analytic philosophy. Results are compared to the co-authorship network of the same field and the outcome of citation-based mapping techniques is related to the social structures emerging from acknowledgments-based mapping. Lastly, Sect. 5 concludes by discussing in which sense acknowledgments-based networks are a special kind of social network, the practical limitations of acknowledgment mapping, and some possible future research lines.

The function of the acknowledgments in the scholarly communication system

In this section, the function of the acknowledgments in the scholarly communication system is discussed, presenting first the standard account in library and information sciences and then a refined account that conceives the acknowledgments as positioning signals. This section discusses the rationale for engaging in acknowledgments-based networks and how they can be interpreted. However, it is pivotal to highlight that the formal construction of the acknowledgments-based networks presented in Sect. 3 does not depend on the framework advanced in this section, i.e., it is compatible with alternative views on the function of the acknowledgments.

The standard account

In library and information science, the standard account of the acknowledgments assumes that they serve to reward informal collaborators. Specifically, its main tenet is that the acknowledging behavior of researchers is governed by a set of norms and common practices, which prescribe to reward informal contributions by public mention in the academic publications (Cronin, 1995; Cronin & Overfelt, 1994; Cronin & Weaver-Wozniak, 1995). These norms would be part of the tacit knowledge that is transmitted from mentors to junior researchers during their socialization in the academic community. Surveys of scientists and scholars from different disciplines showed indeed that authors subscribe to the idea of an etiquette governing the writing of acknowledgments (Cronin & Overfelt, 1994). Authors share an expectation of being acknowledged for contributions to the work of others and likewise expect to give acknowledgments under certain conditions. Codifications of the acknowledgment norms can be sometimes found in scientific journals, which provide detailed guidelines on who should be included in the author list and who in the acknowledgments (see e.g. ICMJE, 2019).

Acknowledgment norms, therefore, provide the answer to the question: Why does a certain publication mention a certain set of acknowledgees and not others? They guarantee that an author mentions in the acknowledgments of her publication only those that provided some sort of contribution to the publication itself. Hence, they legitimate the following inference: if A mentioned B in the acknowledgments of publication P, then B provided some sort of contribution (intellectual feedback, moral support, editorial assistance, etc.) to A’s publication P. This inference has three important consequences: (i) the link “P mentions B” can be reinterpreted in the opposite direction as “B contributed to P”; (ii) B can be considered as a sub-author of P (Patel, 1973); and (iii) A and B relationship can be treated as an informal scientific collaboration (Laband & Tollison, 2000). According to this normative framework, hence, the relation expressed in the acknowledgments would be structurally similar to co-authorship, with the only difference that the former would result from informal collaboration, whereas the latter from formal or more structured forms of collaboration.

A closer look, however, reveals that the two concepts are characterized by at least two other differences. First, all the authors of a publication know, at least in principle, who the other co-authors are. All the co-authors know that they are co-authors of the same publication, i.e., relationship of co-authorship, is known by all the participants in the relationship itself.Footnote 3 By contrast, a person can be mentioned in the acknowledgements of a publication without knowing it and without knowing who the other persons mentioned are. Second, co-authorship links are symmetric, i.e., do not have a direction, whereas mention links are asymmetric, i.e., they have a source and a target. The direction of the mention link can be reversed (e.g., “A acknowledges B” can be reversed in “B helped A”) but cannot run in both directions at the same time. This means that to individuate a reciprocal relationship, two mention links are required (i.e., “A mentions B” and “B mentions A”, or “B helped A” and “A helped B”), whereas only one co-authorship link is needed. If we deem the reciprocal exchange between collaborators an important feature of scientific collaboration (see e.g., Huebner et al., 2018), then a single mention link between A and B is not enough for properly speaking of scientific collaboration because it does not guarantee reciprocity between A and B.

These two characteristics of acknowledgment mentions make them closer to citations rather than to co-authorships, as authors can be cited without knowing it and citations have always a direction, going from the citing to the cited document. In this sense, Edge (1979) has defined mentions in the acknowledgements “super-citations” (p. 106). If acknowledgments are considered as a type of citation rather than as a type of co-authorship, the following question arises naturally: Why do the acknowledgees, i.e., the persons who are mentioned in the acknowledgments, receive different numbers of mentions? Which property of the acknowledgees do mentions reflect, if any?

The standard account, again, appeals to acknowledgement norms to answer this question. In this sense, the normative framework can explain both acknowledgments-as-co-authorship and acknowledgments-as-citations. If the norms prescribe the authors to reward informal contributors in the acknowledgments, then a frequently mentioned researcher is a researcher that has given her contribution to many publications, helping many colleagues. Mentions, hence, should be considered a measure of the “helpfulness” of researchers to the scientific community, i.e., of her propensity to collaborate (see e.g., Oettl, 2012). In this perspective, the acknowledgments represent a fundamental component of the reward system of science, being the third vertex of the “reward triangle of science”, along with citations and authorships (Cronin & Weaver-Wozniak, 1995). They would be a further mechanism used by researchers to distribute prestige in the scientific community.

Strategic factors in the acknowledgment behaviour

As it should be clear by now, the standard account based on acknowledgment norms relies crucially on a hypothesis about the motivations that guide the acknowledgment behaviour of authors, namely that these motivations would be coherent with the acknowledgment norms diffused in the community. However, it can be argued that other, more strategic or opportunistic motivations may lie behind the concrete choices the authors made. In particular, some researchers have argued that the acknowledgees are chosen based on the effect they have on the readers, editors, or even referees, rather than to reward informal contributions (Berg & Faria, 2008). In this sense, respected scholars may be mentioned to increase the perceived quality of the article: by trading on the prestige of the persons they mention, authors would increase their status in the field, augmenting the chances of being read and cited (Coates, 1999; Giannoni, 2002):

Just as the underlying reasons to cite can be diverse, noble, or self-serving, the motivations to acknowledge the support of others can range from flattery and name dropping to the sincere or required demonstration of gratitude upon individuals, organizations, or funding agencies (Desrochers et al., 2018, p. 233).

This means that, in the case of strategic acknowledgments, the inference from (i) “A mentions B” to (ii) “B contributed to A” is no more justified. In particular, (ii) should be replaced with (ii’) “B represents a source of academic or symbolic legitimization for A”.

Moreover, if mentions are not given because of credit recognition but because of prestige and symbolic capital, then the number of mentions received by an acknowledgee cannot be interpreted directly as a measure of helpfulness. Rather, mentions would indicate academic prestige per se, originating from other factors than the propensity to collaboration and “helpfulness”.

Now, it is highly implausible that all the mentions in the acknowledgments obey strategic considerations.Footnote 4 However, it is well known that mentions are concentrated in relatively few acknowledgees, i.e., the distribution of mentions in the scholarly communities is highly skewed (Baccini & Petrovich, 2021; Giles & Councill, 2004; Petrovich, forthcoming). From this point of view, hence, it is plausible that the highly mentioned acknowledgees receive a bonus of mention due to the fact that they have been already mentioned, i.e., that a cumulative advantage mechanism is active in the community (Merton, 1988; Price, 1976). If prestige feeds prestige, the mention of these acknowledgees becomes more and more a valuable signal for strategic purposes and, therefore, strategic mentions become more likely. The strategic components of the acknowledgment behaviour, hence, cannot be excluded, leading to the idea that mentions may reflect both propensity to informal collaboration and prominence in the academic field (Bourdieu, 2008). In this sense, they would be similar to citations, which reflect both scientific quality and academic visibility (Aksnes & Rip, 2009; Aksnes et al., 2019).

Acknowledgments as positioning signals

The integration of strategic motivations in the explanation of mentions allows us to better understand the links between (the authors of) publication P and the acknowledgees it mentions. Ideally, they should be supplemented with a qualitative dimension accounting for the type of relation existing between the (authors of) the publication and the acknowledgees they mention. Such a relation can be of informal contribution, when the acknowledgee actually contributed to the publication, but also of symbolic alliance, when the authors rather want to signal their tie with the acknowledgee. Note that this qualitative dimension cannot be easily translated into a number (a weight of the link) because the two qualities are not mutually excluding: the same acknowledgee may be mentioned to reward her contribution and to trade on her prestige. In practice, reconstructing this qualitative dimension may be problematic, as motivations behind the acknowledgments are private mental states (Cronin, 1984). Interviews and surveys may help to access them but are not immune from well-known methodological issues: researchers may not be fully aware of or candid about their motivations, their memory can be at fault, or the offered motivations may be post-hoc rationalizations (Baldi, 1998; Bornmann & Daniel, 2008).

However, in the perspective of building a more complete account of the function of the acknowledgments, reminding that they do not carry only informal collaborations but also prestige and symbolic capital is important to better characterize the kind of social network that is generated by using the acknowledgments as source of data.

This social network cannot be simply intended, as in the standard account coming from library and information science, as the informal complement of the co-authorship network because, first, the relationship between authors and acknowledgees may not be limited to collaboration and, second, they mention-link does not automatically imply reciprocity. We propose then to interpret the relationship as a signal of alliance that the authors send to their community of peers. By mentioning certain persons in the acknowledgments of their publications, the authors show publicly that they are associated with certain figures of the field. In the words of Cronin, authors use the acknowledgments to externalize their «tribal affiliations and loyalties» (Cronin, 1995, p. 72), i.e., their affiliation to certain intellectual schools or academic circles. This signal, however, should not be intended restrictively as a means to obtain a certain benefit (receive a preferential treatment by editors or attract more citations), as a purely strategic account of the acknowledgments would assume (Baccini & Petrovich, 2021). Rather, the signalling of allies should be intended as a way of positioning in the scientific or scholarly community, i.e., as a way in which the authors specify their position with respect to other social actors in their field. In this sense, the acknowledgments contribute to building a shared system of social coordinates within the social layer, in which the authors can position themselves.

Note that the concept of the acknowledgments as positioning signals shifts crucially the focus of acknowledgments theory from the motivations behind the acknowledgments to the public nature of the acknowledgments. If the formers are ultimately not accessible and hence matter of speculation, the latter is intrinsic to the fact that the acknowledgments appear in artifacts, the scientific publications, that are meant to have a public consisting at least in the peers of the authors. In other terms, this framework insists on the communicative nature intrinsic to the acknowledgments, which are intended as a communication act, i.e., a signal, from an author to the community to which she belongs. From a theoretical point of view, it allows dropping the hypothesis of the acknowledgments norms as foundations of the acknowledgments theory, even if it recognizes that the alliance between authors and acknowledgees may derive from the informal collaboration. Other types of ties, however, are considered as well, such as academic, social, and symbolic ties. Also, it is recognized that the same tie may be of multiple types at the same time.

Another advantage of the framework of acknowledgments as positioning signals is that it naturally covers the non-human entities mentioned in the acknowledgments, such as funding bodies and congresses. Talking of “informal collaboration” with reference to these entities seems in fact misguided. By contrast, their role is clarified if they are intended as part of the signal that the authors send through the acknowledgments. By associating themselves with prestigious institutions or powerful funders, the authors situate themselves in a wider network of alliances in the sense of Latour (2005).

The framework allows also refining the interpretation of the highly mentioned acknowledgees. In the normative account, they are interpreted as altruistic researchers with a high propensity to collaborate (see e.g., Laband & Tollison, 2003). In this new framework, they should be intended as social actors that many researchers want to “recruit” among their allies. They are therefore likely to be crucial actors in the sociology of the field, who may play the roles of gatekeepers or brokers among different communities. Exploring whether their centrality in the acknowledgments network is reflected or not in formal roles in the community is then an important topic for research. Among the highly mentioned acknowledgees it is, for instance, common to find the editors of prestigious journals (Rose & Georg, 2021), but also researchers that received important scientific prizes (Baccini & Petrovich, 2021). At the same time, there are actors that do not have these formal roles of gatekeeping. They are therefore less visible when only formal structures of gatekeeping, such as editorial boards, are examined (Cronin et al., 2004) and may play specific roles, e.g., they might be excellent mentors. Another cue that highly mentioned acknowledgees may play specific roles comes from the fact the number of citations seems not to be correlated with the number of mentions in the acknowledgments (Giles & Councill, 2004; Petrovich, forthcoming), suggesting that the role of highly mentioned researchers is different from that of their highly-cited counterparts.

Clearly, the framework of acknowledgments as positioning signals that it is proposed here does not answer all the questions about the role of acknowledgees in the social layer of science. However, compared with the standard normative account, it allows considering further interpretations besides the informal collaboration option and the related concept of helpfulness. From this point of view, it should be considered as a primer for a full-fledged theory of acknowledgments, rather than as a theory itself. At the same time, we hope it provides a rationale for engaging in the empirical study of the acknowledgments and, specifically, of the acknowledgments-based networks that are the topic of the present paper.

Formal definition of the acknowledgments-based networks

Before introducing the method of acknowledgments-based networks, it is useful to remember some basic notions of science mapping and network analysis on bibliographic data.

Networks extracted from academic publications, called bibliographic networks, are the basis of standard science maps (Batagelj & Cerinšek, 2013). Citation networks are generated considering a set of publications and the references they cite. In a citation network, the nodes represent publications and the links the citations that connect them. Co-authorship networks, on the other hand, are generated from the author lists of a set of publications. Their nodes represent authors and the links the co-writing of a publication. Citation networks are directed, as citations have a sender (the citing publication) and a receiver (the cited publication) and usually, are not weighted, as the relationship of citing between two publications can either exist or not, without intermediate degrees. Co-authorship networks, by contrast, are undirected networks, as the relationship of co-authorship is symmetric, and weighted, with the strength of the link between pairs of authors equals the number of papers co-authored by two authors.

Citation networks can be used to generate weighted undirected networks as well with the techniques of bibliographic coupling and co-citation analysis. In a bibliographic coupling network, the weight of the link between two publications is based on the number of references they share, i.e., on the number of publications that appear in both their bibliographies (Kessler, 1963). In a co-citation network, by contrast, the weight of the link is based on the number of publications that cite together each pair of publications, i.e., on the number of co-citations in the literature of each pair (Small, 1973). In both networks, the weight of the link between pairs of publications is usually interpreted as a measure of intellectual similarity, based on the idea that publications with overlapping bibliographies or that are frequently cited together in the literature are likely to be about similar research topics.

Drawing on the bibliographic networks presented above, it is possible to define three main networks based on the acknowledgments of academic publications. The first and fundamental network is the Publications \(\times\) Acknowledged Entities network (PEN). PEN is a two-mode or bipartite network comprising two types of nodes: publications on the one hand and the entities mentioned in the acknowledgments on the other hand. An edge is drawn between a publication and an acknowledged entity when the former mentions the latter in its acknowledgments. Clearly, no publication is connected with other publications and no acknowledged entity with other acknowledged entities.Footnote 5

Acknowledged entities are then furtherly partitioned into groups based on the category of the acknowledged entity. Funding agencies, congresses and conferences, institutions, and persons are thus differentiated, and the PEN is divided into the corresponding subnetworks Publications \(\times\) Funding Agency, Publications \(\times\) Conferences, and so on. Hence, the PEN network can formally be written as a graph \(\mathcal{N}=\left(\mathcal{P},\mathcal{E},\mathcal{L},r\right)\), where \(\mathcal{P}\) is the set of acknowledging papers, \(\mathcal{E}\) the set of acknowledged entities, \(\mathcal{L}\) the set of edges linking \(\mathcal{P}\) and \(\mathcal{E}\), and \(r:\mathcal{E}\to\mathcal{T}\) is a function that maps each acknowledged entity to an entity type in the set of entity types \(\mathcal{T}\).

Of particular interest to investigate the social network of scientific fields is the Publications \(\times\) Acknowledgees network (PAN), which focuses on the persons that are thanked in the acknowledgments. In the following, we will focus on this network but similar considerations hold for the other subnetworks of PEN.

An example of PAN is provided in Fig. 1, where four publications \({p}_{1},\dots , {p}_{4}\) are linked with five acknowledgees \({k}_{1},\dots ,{k}_{5}\). Publication \({p}_{1}\) mentions all five acknowledgees, whereas publication \({p}_{4}\) mentions none.

Fig. 1
figure 1

Relationships between the Publications \(\times\) Acknowledgees Network (PAN) and the derived networks ACN and AMN. The secondary acknowledgment coupling network (ACN) and acknowledgee co-mention network (AMN) are obtained by projecting the primary publication-acknowledgee network, respectively, on publication and acknowledgee nodes. The secondary networks are undirected, weighted networks, whereas the primary is a directed, unweighted network

PAN can be mathematically represented by the incidence matrix \(\mathbf{N}\), in which rows represent publications and columns acknowledgees. Let \(P\) and \(K\) denote, respectively, the number of publications and acknowledgees in PAN, which correspond to the number of rows and columns in \(\mathbf{N}\). When a publication in the \(p\)th row mentions an acknowledgee in the \(k\)th column, the corresponding element \({o}_{pk}\) in the matrix equals one, zero otherwise:

$$\left.\begin{array}{cccccc}&k_1 &k_2 &k_3 &k_4 &k_5\\ p_1& 1& 1& 1& 1&1\\ p_2& 1& 0& 1& 1&0\\ p_3& 1& 0& 0& 1& 0 \\ p_4 & 0 & 0 & 0 & 0 &0\end{array}\right.$$

Formally written as:

$$\mathbf{N}= \left[\begin{array}{ccccc}1& 1& 1& 1& 1\\ 1& 0& 1& 1& 0\\ 1& 0& 0& 1& 0\\ 0& 0& 0& 0& 0\end{array}\right]$$

N is a rectangle binary matrix (i.e., each element equals either zero or one) of order \(P\times K\) (in our example, 4 publications \(\times\) 5 acknowledgees). Note that the row total equals the total number of acknowledgees \({a}_{p}\) mentioned by publication p:

$${a}_{p}= \sum_{k=1}^{K}{o}_{pk}$$

whereas the column total equals the total number of mentions \({m}_{k}\) received by acknowledgee k, i.e., the total number of publications where she is mentioned:

$${m}_{k}= \sum_{p=1}^{P}{o}_{pk}$$

In PAN, \({a}_{p}\) and \({m}_{k}\) correspond to the out-degree of publication \(p\) and the in-degree of acknowledgee \(k\), respectively. Lastly, the sum of all the elements in N equals the number of edges in PAN, i.e., the total stock of mentions that is distributed among the population of acknowledgees. Clearly, the maximum number of mentions equals \(P\times K\), corresponding to the case in which each publication mentions every acknowledgee.

The second bibliographic network that can be generated from the acknowledgments is the Acknowledgee Co-Mention Network (AMN). Similarly to the co-authorship network, AMN is an undirected weighted one-mode network. Nodes in the AMN represent the acknowledgees mentioned in a set of publications. An edge between two acknowledgees is drawn when they are mentioned together in the acknowledgments of the same publication, i.e., when they are co-mentioned.Footnote 6 The weight of the edge equals the number of publications in which the two are mentioned together. AMN provides important insights on the structure of the social layer of a scientific or scholarly field, as groups of acknowledgees that appear frequently in publications are likely to belong to the same invisible colleges, i.e., to those basic social units where new ideas are developed and discussed. In the terms of the acknowledgments as positioning signals frameworks, frequently co-mentioned acknowledgees constitute a perceived social community within the social layer, i.e., the authors tend to associate these allies more frequently, suggesting that an alliance with the group is more valuable than an alliance with the individual acknowledgee. It is important to highlight that the co-mentioned acknowledgees are perceived by the community as forming an academic circle, intellectual school or group of allies. But it is possible that this community perception is not aligned with the perception of the acknowledgees themselves, that might have no mutual connection or social interaction the one with the other.Footnote 7

The third bibliographic network is the Acknowledgment Coupling Network (ACN). Again, ACN is an undirected weighted one-mode network. Differently from the AMN, however, in the ACN, nodes represent publications. An edge is drawn between two publications when they share at least one acknowledgee, i.e., when they both mention at least one common acknowledgee. The weight of the edge in ACN equals the number of shared acknowledgees between two publications. The weights in the ACN allows us to compare the social profiles of publications. Two publications sharing several acknowledgees are likely to have a similar social background, i.e., their authors are likely to be acquainted with the same group of commenters and collaborators. Therefore, the ACN and the AMN allow investigating the social layer of research fields from two complementary perspectives. Strictly speaking, the ACN is not a social network in the sense that its nodes represent human actors. However, the links between the nodes in the ACN are produced by humans, i.e., the authors of publications, to situate their position among other human actors, i.e., the acknowledgees. In the framework of acknowledgments as positioning signals, in fact, the acknowledgees mentioned in a publication are signals sent by the authors of the publication to their peers, i.e., social acts. The ACN can therefore be intended as a network that pertains to the social layer of science because it is the product of social acts, i.e., signals exchanged between the social actors precisely to specify their position in the social layer.

It is easy to see how the equivalents of ACN and AMN can be generated using different partitions of the nodes in the PEN. Focusing for instance on the Publications \(\times\) Funding Agencies sub-network, the funding coupling network is the equivalent of the ACN, and the co-mentioned funding agency network is the equivalent of the AMN.

Table 1 provides an overview of the four networks defined above with their main characteristics.

Table 1 The four acknowledgments-based network. The label, acronym, meaning of nodes and links, and type is presented for each network

Construction of ACN and AMN

From a mathematical point of view, the relations between PAN, ACN, and AMN can be expressed by simple matrix algebra operations.

The matrix \(\mathbf{P}\) representing the ACN can be obtained by the post-multiplication of PAN incidence matrix \(\mathbf{N}\) for its transpose:

$$\mathbf{P}=\mathbf{N}\cdot {\mathbf{N}}^{\mathbf{T}}=\left[\begin{array}{cccc}5& 3& 2& 0\\ 3& 3& 2& 0\\ 2& 2& 2& 0\\ 0& 0& 0& 0\end{array}\right]$$

P is a symmetrical, non-negative matrix of order \(P\times P\) (4 \(\times\) 4 in our example). Let \({c}_{ij}\) denote the element in the \(i\) th row and \(j\) th column of P. For \(i\ne j\) (i.e., for the off-diagonal elements), \({c}_{ij}\) denotes the number of acknowledgees shared by publications \(i\) and \(j\). For instance, publication \({p}_{1}\) and \({p}_{2}\) are connected in the ACN with linkage strength 3 because they both mention acknowledgees \({k}_{1}\), \({k}_{3}\) and \({k}_{4}\) in PAN. For \(i=j\) (i.e., for elements on the diagonal), \({c}_{ij}\) equals the total number of acknowledgees \({a}_{p}\) mentioned by publication p.

Similarly, the matrix \(\mathbf{K}\) representing the AMN results from pre-multiplication of \(\mathbf{N}\) for its transpose:

$$\mathbf{K}={\mathbf{N}}^{\mathbf{T}}\cdot \mathbf{N}= \left[\begin{array}{ccccc}3& 1& 2& 3& 1\\ 1& 1& 1& 1& 1\\ 2& 1& 2& 2& 1\\ 3& 1& 2& 3& 1\\ 1& 1& 1& 1& 1\end{array}\right]$$

Again, K is a symmetrical, non-negative matrix of order \(K\times K\) (5 \(\times\) 5 in our example). Off-diagonal elements denote the co-mentions of acknowledgees in the \(i\)th row and \(j\)th column of K, whereas diagonal elements equal the total mentions \({m}_{k}\) of acknowledgee k.Footnote 8 For instance, acknowledgees \({k}_{4}\) and \({k}_{3}\) share a link of strength 2 because they are co-mentioned in publications \({p}_{1}\) and \({p}_{2}\).

In network terms, these matrix algebra operations correspond to the one-mode projection of the two-mode PAN network on the acknowledgee nodes (AMN) and publication nodes (ACN), respectively (Newman, 2018).

Normalization of raw co-occurrence frequencies in ACN and AMN

The raw co-occurrences in P and K provide a first indication of the similarity between pairs of publications and pairs of acknowledgees, respectively. However, in scientometrics, it is recommended to not use raw co-occurrence data to measure similarities, as raw data are sensible to the size of the units of analysis, i.e., to their number of occurrences (van Eck & Waltman, 2009; Waltman & van Eck, 2007). To understand why, suppose that acknowledgee A receives ten times the mentions of acknowledgee B in a corpus of publications. Other things equal, then, acknowledgee A is likely to have about ten times as many co-mentions as acknowledgee B with other acknowledgees mentioned in the corpus. However, the fact that acknowledgee A has more co-mentions with other acknowledgees does not indicate that she is more similar to other acknowledgees than acknowledgee B. It only shows that she has more mentions, i.e., that she is “bigger” than acknowledgee B. To correct such distortions due to the different number of mentions, raw co-mentions should be adjusted by some stable quantity, i.e., they should be normalized. To perform this normalization, similarity measures are employed to allow rescaling of the raw co-occurrences in the range \([0, 1]\), where 0 indicates complete dissimilarity and 1 maximum similarity. Several similarity measures are available in the scientometric literature, and we refer the reader to Petrovich (2020) and van Eck and Waltman (2009) for further details on this topic. The main point for our discussion of acknowledgments-based networks is that normalization is needed to meaningfully compare the similarities between units with very different sizes, namely, acknowledgees with very different mentions and publications with acknowledgment sections of very different lengths.

Once similarities between each pair of publications or each pair of acknowledgees are computed, they are arranged in two square symmetrical similarity matrices \(\mathbf{P}\mathbf{^{\prime}}\) and \(\mathbf{K}\mathbf{^{\prime}}\). These data can then be analyzed using multivariate analysis techniques such as multi-dimensional scaling (MDS) or hierarchical clustering. Alternatively, the strength of links in ACN and AMN can be replaced with the corresponding similarities and the networks visualized with classic force-directed graph drawing algorithms.

Testing acknowledgements-based networks on analytic philosophy

To assess the method of acknowledgments-based networks, we tested it on a humanities field, namely analytic philosophy. The choice of this case study rests upon a threefold consideration.

First, previous studies of the journal Mind, a leading analytic philosophy journal, found a high acknowledgments intensity in the field (Cronin et al., 2003). During the Twentieth century, the share of articles in Mind featuring an acknowledgment has increased steadily from 7% at the beginning of the century to 94% in the 1990s. Acknowledgments seem then to be an established feature of analytic philosophy articles.Footnote 9

Second, the publications in the leading journals in analytic philosophy have been already mapped with citation-based science mapping techniques (Petrovich & Buonomo, 2018). Co-citation analysis, in particular, has proven to be successful in delineating the sub-disciplinary organization of the field. The structures in the co-citation network are recognizable by experts in the field, who can associate them with sub-areas of analytic philosophy. These maps can be hence compared with the maps of the social layer of the field generated with acknowledgments-based networks.

Third, analytic philosophy is taken here as a representative of humanities areas, at least from the viewpoint of authorship practices. Since multiple authorship is relatively uncommon in analytic philosophy like in other humanities fields, co-authorship analysis is likely to be scarcely useful to trace its social structures. At the same time, serials have become a common publication outlet for analytic philosophers (Levy, 2003). Therefore, focusing on journal articles allows gathering a representative sample of the field. From this point of view, collecting data from standard databases is easier for analytic philosophy than for other philosophical traditions whose communicative practices rely more on books or collections.

In the context of this study, however, analytic philosophy is taken mainly as a test for the methodology of acknowledgments-based networks. More detailed analyses focusing on the acknowledgments in analytic philosophy per se are presented in Petrovich (forthcoming).

Corpus, data extraction, and cleaning

The construction of the data needed for generating acknowledgments-based networks includes several steps.

First, a corpus of acknowledgment texts is collected from relevant publications. For analytic philosophy, we considered all the research articles published in five prestigious analytic philosophy journals (Philosophical Review, Journal of Philosophy, Mind, Noûs, and Philosophy and Phenomenological Research) between 2005 and 2019, resulting in a population of 2,073 publications.Footnote 10 The metadata of these articles (authors, titles, abstracts, and so on) were retrieved from Web of Science, along with their cited references, that were used to generate maps of the intellectual layer (see below). Acknowledgement texts were collected manually from the electronic versions of the articles and were associated with the article’s metadata in a database.

Then, the entities mentioned in each acknowledgment were extracted with the Named Entity Recognition (NER) module of spaCy (https://spacy.io/), a natural language processing package for Python. The NER algorithm is able to recognize named entities and classify them into several categories, such as persons, institutions, locations, numbers, and so on. This information is crucial to partition the nodes in the PEN into different groups of acknowledged entities. Since the entities classified as persons are the most frequent in our corpus and those needing less data cleaning, this study focuses on the acknowledgees mentioned in analytic philosophy articles, i.e., we consider the PAN, Publications \(\times\) Acknowledgees Network, rather than the wider PEN.

However, the raw output of the NER algorithm cannot be directly used to build the PAN, as it presents two main problems. First, it produces both false positives (entities classified as persons that are not persons) and false negatives (persons not recognized as such). These misattributions need to be manually corrected. Second, the names of the acknowledgees occur in several variants that must be standardized. For instance, diminutives of first names are common in the acknowledgments (e.g., “Tim” for “Timothy”, “Dave” for “David”) and sometimes composite surnames appear in several versions. Therefore, a considerable amount of data cleaning was done to obtain a cleaned and reliable dataset for the generation of acknowledgments-based networks. As we will discuss more extensively in the last section of the paper, the substantial data cleaning that is needed to correctly implement this methodology prevents an easy scaling up to massive datasets.

The cited references mentioned in the 2073 articles were cleaned as well using the software CRExplorer (Thor et al., 2016). CRExplorer algorithm, based on string similarity, is able to merge references that are likely to point to the same publication. We manually supervised the process to prevent incorrect merging and individuate further reference variants to standardize.

Lastly, the three networks PAN, AMN, and ACN were constructed using Pajek (Batagelj & Mrvar, 2004), and visualized with Gephi (Bastian et al., 2009) and VOSviewer (van Eck & Waltman, 2010).

As to AMN and ACN, the raw co-occurrences were normalized using the association strength (van Eck & Waltman, 2009), which is defined as:

$$A({c}_{ij}, {s}_{i}, {s}_{j})= \frac{{c}_{ij}}{{s}_{i}{s}_{j}}$$

where \({c}_{ij}\) is the co-occurrences of units \(i\) and \(j\), \({s}_{i}\) is the total number of occurrences of unit \(i\), and \({s}_{j}\) the total number of occurrences of item \(j\). Clearly, \(A\left({c}_{ij}, {s}_{i}, {s}_{j}\right)=[0, 1]\). The association strength is a probabilistic similarity measure that can be interpreted as a measure of deviation of observed co-occurrence frequencies of \(i\) and \(j\) from the co-occurrences frequencies that would be expected under the assumption that the occurrences of \(i\) and \(j\) are statistically independent (see van Eck & Waltman, 2009, p. 1647).

In the AMN, \({c}_{ij}\) corresponds to the number of publications co-mentioning the acknowledgees \(i\) and \(j\), whereas \({s}_{i}\) and \({s}_{j}\) correspond, respectively to the mentions \({m}_{i}\) and \({m}_{j}\) received by \(i\) and \(j\), so that the association strength measures the similarity between acknowledgees based on the number of co-mentioning publications. In the ACN, conversely, \({c}_{ij}\) corresponds to the number of acknowledgees shared by publications \(i\) and \(j\), whereas \({s}_{i}\) and \({s}_{j}\) correspond, respectively, to the number of acknowledgees \({a}_{i}\) and \({a}_{j}\) mentioned by publications \(i\) and \(j\). In this case, the association strength measures the similarity between publications based on the number of shared acknowledgees.

Results

Descriptive statistics of the dataset are displayed in Table 2. Most of the articles feature an acknowledgment (94%) and, of these, 95% mention at least one acknowledgee. Notably, the distribution of acknowledgee per paper is positively skewed, with few gigantic acknowledgments that pull the mean towards the right of the median.

Table 2 Descriptive statistics of the dataset

5,774 distinct acknowledgees are mentioned in the corpus, a population more than four times bigger than that of formal authors of the articles (= 1391). Most of the acknowledgees (60%) are mentioned only in one article. 14% receive 5 mentions or more and only 2% receive 20 mentions or more. The average acknowledgee receives 2.9 mentions (standard deviation = 5.1; median = 1). Mentions are therefore unequally distributed in the population. In terms of the PAN, this means that the distribution of acknowledgee nodes according to their degree is highly skewed, with relatively few acknowledgee nodes attracting most of the mentions from publication nodes.Footnote 11

A visualization of the PAN is provided in Fig. 2. The most mentioned acknowledgees (the big nodes in pink) and the articles with many acknowledgees (the big nodes in green) are clearly visible.

Fig. 2
figure 2

The PAN of analytic philosophy. Green nodes (24%) represent publications, pink nodes (76%) acknowledgees, edges the relationship of mentioning. The size of the nodes is proportional to their degree, i.e., number of acknowledgees for publications and number of mentions for acknowledgees. The network is laid out with the Yifan Hu algorithm in Gephi

Mapping the social layer of analytic philosophy: co-authorship network vs. ACN and AMN

The average number of co-authors per paper, which is slightly higher than 1, shows that co-authorship analysis cannot perform well on analytic philosophy. In fact, 70% of the unique authors in the corpus has no co-author, i.e., they are isolated in the co-authorship network, and 22% of them have only another co-author in the corpus. The resulting co-authorship network, shown in Fig. 3, is clearly too sparse to be informative. Empirical results hence confirm that co-authorship networks provide a poor understanding of the social layer of analytic philosophy.

Fig. 3
figure 3

Co-authorship network of analytic philosophy. Nodes represent distinct authors in the corpus (n = 1,391), edges co-authorship. The thickness of the edge is proportional to the number of co-authored articles. The network is laid out with the Kamada-Kawai algorithm in Pajek

The Acknowledgee Co-Mention Network (AMN), on the other hand, is not only bigger in size (5,774 versus 1,391 nodes) but also more interconnected.

To better understand the social structures within the AMN, we consider only the acknowledgees with 10 mentions or more in the corpus (n = 327). This allows to reduce the noise in the visualization and helps the interpretation of the map, which is shown in Fig. 4.Footnote 12 The VOSviewer algorithm for the detection of communities in networks individuates 6 communities of acknowledgees, that are represented by different colors in the map.

Fig. 4
figure 4

Acknowledgee Co-Mention Network (AMN) of analytic philosophy. Nodes represent acknowledgees mentioned by at least 10 articles, edges the co-mentioning of pairs of acknowledgees. Nodes’ size is proportional to the number of mentions, edges’ thickness to co-mentions (only strongest edges are shown). Nodes’ colors correspond to the communities individuated by VOSviewer community detection (resolution = 1.0). Nodes are placed in the space according to their mutual similarity so that closer nodes are frequently co-mentioned and distant nodes are scarcely co-mentioned

The Acknowledgment Coupling Network (ACN), on the other hand, shows the structure of the network of articles based on their social similarity, i.e., the number of shared acknowledgees (Fig. 5).Footnote 13 VOSviewer algorithm for community detection individuates 16 main communities.

Fig. 5
figure 5

Acknowledgment Coupling Network (ACN) of analytic philosophy. Nodes represent articles, edges the presence of shared acknowledgees between pairs of articles. Nodes’ size is proportional to the number of acknowledgees mentioned, edges’ thickness to the number of shared acknowledgees. Nodes’ colors correspond to the communities individuated by VOSviewer community detection algorithm (resolution = 1.0). Isolated articles are removed from the visualization. Nodes are placed in the space according to their mutual similarity so that closer nodes have many acknowledgees in common and distant nodes have few acknowledgees in common

AMN and ACN network statistics are provided in Table 3.

Table 3 AMN and ACN networks statistics

Intellectual structures vs. social structures in analytic philosophy

The ACN and the AMN, which trace the structure of the social layer of analytic philosophy, can be compared with two citation-based networks that map the intellectual layer of the field, namely the bibliographic coupling network (BCN) and the co-citation network (CCN). The BCN and the CCN are derived from the same citation network Publications \(\times\) Cited references by projection on the publication nodes and on the cited references nodes, respectively. As said in Sect. 3, in the BCN, the similarity of pairs of publications is assessed based on the number of references they share, whereas, in the CCN, the similarity of cited references is assessed based on the number of publications that co-cite them.

The comparison of BCN and ACN is particularly interesting, as these networks insist on the same type of nodes, namely publications.Footnote 14 Acknowledgee coupling and bibliographic coupling are in fact two alternative ways of measuring the similarity between papers: the former compares the social profile of two publications, the latter their intellectual profile. In this sense, it is interesting to study whether the profiles of publications, on average, converge or diverge, i.e., whether the structures found in the intellectual network of publications (BCN) have their counterpart in the social network of the same publications (ACN).

To do this, we compare the communities found in the ACN and in the BCN of analytic philosophy with the VOSviewer algorithm for community detection. Note that we consider only the articles that have both acknowledgees and cited references (n = 1804), and of these, only those that are not isolated in either network (n = 1762). With a constant resolution parameter = 1.0 for both networks, the algorithm individuates 16 main communities in the ACN and 7 main communities in the BCN.Footnote 15 The ACN seems therefore to individuate a more fine-grained structure in the set of articles compared to the BCN.

By examining the titles and abstracts of the publications in each cluster, the 7 communities can be easily associated by an expert to the main research areas in analytic philosophy (ethics and moral philosophy, logic and philosophy of language, philosophy of science, metaphysics, epistemology, philosophy of mind, and history of philosophy), confirming that bibliographic coupling is a powerful technique for reconstructing the intellectual organization of scholarly fields. A visualization of the BCN is provided in Fig. 6, where the various clusters are labeled with the corresponding philosophical area.

Fig. 6
figure 6

Bibliographic Coupling Network of analytic philosophy. Nodes represent articles, edges the presence of shared references between pairs of articles. Nodes’ size is proportional to their degree, edges’ thickness to the number of shared references (normalized with the association strength). Nodes’ colors correspond to the communities individuated by VOSviewer community detection algorithm (resolution = 1.0). Isolated articles are removed from the visualization

The partition into communities of the ACN, on the other hand, reveals a messier structure. In Fig. 7, the 16 clusters of ACN, represented by different colors, are overlaid on the BCN map, which is used as a reference framework.

Fig. 7
figure 7

ACN communities overlaid on the BCN network layout. In this visualization, the BCN network is used as a reference layer: nodes represent articles, edges shared references, and nodes’ relative position the bibliographic coupling strength between them. Nodes’ colors, however, are attributed based on the partitioning of the ACN network shown in Fig. 6

To better understand the relationship between intellectual and social communities, crossed frequencies between the two partitions are reported in Table 4. Communities based on intellectual similarity are on the rows and are identified by labels, communities based on social similarity are on the columns and are identified by letters.

Table 4 Contingency table between intellectual communities (rows) and social communities (columns) in analytic philosophy

A Chi-squared test of independence (\({\chi }^{2}\left(90, N=1762\right)=1935.9, p<0.000)\) rejects the null hypothesis that the two partitions are independent, i.e., that the distribution of papers among the two kinds of community is statistically random. Specifically, the Cramer’s V yields an effect size of 0.43, indicating a medium association strength between the two partitions.

The cross-frequencies reported in Table 4 show that papers are rather dispersed among different social communities. However, some social communities host most of the papers in certain areas: for instance, 218 out of 313 (70%) papers in Ethics belong to social community A. Since the two partitions can be treated as categorical variables, the association strength between each pair of intellectual and social communities can be measured using the Pearson residuals of the Chi-squared test (Agresti, 2007). Pearson residuals are defined as:

$$r= \frac{{f}_{o}-{f}_{e}}{\sqrt{{f}_{e}}}$$

where \({f}_{o}\) is the observed frequency and \({f}_{e}\) is the expected frequency under the null hypothesis of the Chi-squared test. Positive residuals indicate that the observed frequency is higher than expected, and hence that a positive correlation exists between the two variables. By contrast, negative residuals indicate that the observed frequency is lower than expected and, hence, that the two variables are negatively correlated. The association plot showing the Pearson residuals associated with each pair of communities is reported in Fig. 8 and it largely confirms the insights of the contingency table.

Fig. 8
figure 8

Association plot between intellectual communities (rows) and social communities (columns) of analytic philosophy. Positive residuals are in blue, with the ellipse right-oriented, negative residuals in red, with the ellipse left-oriented. Residuals close to zero are in white. The size of the ellipses is inversely proportional to the strength of association so that thinner ellipses indicate a higher association

Specifically, the small community of articles in history of philosophy (53 articles), dealing mostly with early modern philosophy, has a clear social counterpart in the community E, showing that history of philosophy forms a well-defined socio-intellectual cluster in analytic philosophy, relatively detached from other socio-intellectual clusters. As to the other research areas, the social partition of the ACN allows individuating how they are divided into multiple social communities. Epistemology is associated with communities G, H, and, mostly, I, whereas metaphysics is spread over communities F, G, and H. Interestingly, hence, social community H hosts papers that, from an intellectual point of view, belong to two distinct research areas. This may indicate that there is an underlying social cluster that crosses over metaphysics and epistemology. The overlap between specific social communities and the areas of ethics, philosophy of mind, and philosophy of science, on the other hand, is more pronounced, with ethics dividing into social clusters A and C (they are visible in Fig. 7 above as the two peninsulas dividing the cluster of ethics in the northern area of the map), and philosophy of science into B, F, and J. Interestingly, logic and philosophy of language, which were once considered as the core areas of analytic philosophy, are the more dispersed among the social communities. This weak association may indicate that authors doing research on logic and philosophy of language refer to informal collaborators that are distributed in the entire community of analytic philosophy rather than concentrated in specific social clusters.

Further insights on the relationship between the intellectual and social layers of analytic philosophy can be gained from the comparison of the AMN with the CCN. Differently from the ACN and the BCN, which insist on the same nodes, however, the AMN and the CCN lack a direct connection. In fact, the two networks host different types of entities: the AMN nodes represent persons, i.e., the acknowledgees mentioned in the acknowledgments of publications, whereas CCN nodes represent documents, i.e., the references cited in the bibliographies of publications. Rigorous statistical comparisons between the two networks are hence not possible. Nonetheless, a qualitative assessment of the similarities between their community structures is reasonably justified by the fact that they are both derived from the same corpus of articles.

The two networks share another common characteristic, namely their considerable size: the entire AMN contains 5,774 distinct acknowledgees and 109,391 co-mention links, whereas the CCN includes 39,668 distinct cited references and 1,675,502 co-citation links. To ease the interpretation of large-size bibliographic networks like these, in science mapping it is common to select only a subset of relevant nodes and their relations (Cobo et al., 2011). For co-citation networks, the number of citations is a common metric, based on the idea that the documents that occur most frequently in the bibliographies of articles constitute the knowledge base of a research field. Following previous research on co-citation networks of analytic philosophy, we retain in the CCN only the references with 20 citations or more in the corpus (n = 227) (Petrovich & Buonomo, 2018). The resulting sub-network is visualized in Fig. 9. VOSviewer algorithm individuates 5 clusters, that can be easily labelled with the corresponding sub-disciplines of analytic philosophy. Interestingly, the area of history of philosophy, which was visible in the BCN, does not appear in the CCN. This probably happens because the historical works surpassing the threshold of 20 citations are not enough to form an independent cluster.

Fig. 9
figure 9

Co-citation network (CCN) of analytic philosophy. Nodes represent cited documents, edges the presence of co-citation between pairs of cited documents. Nodes’ size is proportional to the citations received by the documents in the corpus of citing articles, edges’ thickness to the number of co-citations (normalized with the association strength). Nodes’ colors correspond to the communities individuated by VOSviewer community detection algorithm (resolution = 1.0). Only documents with 20 citations or more are included

The structure of the AMN, by contrast, is less clearly organized into sub-disciplines (see Fig. 4 above and Fig. 10). Some of the 6 communities individuated by the algorithm host in fact acknowledgees that are mainly specialized in certain areas. Cluster 4, for instance, include mainly philosophers working on metaphysics, cluster 5 philosophers of mind, cluster 2 philosophers specialized in ethics and moral philosophy, cluster 4 philosophers of language. The two remaining clusters, however, include philosophers with various specializations and cannot easily be associated with specific research areas. Moreover, epistemologists are found in all clusters. The structure of the AMN seems therefore to be shaped by both intellectual factors (the common research areas of acknowledgees) and social factors. In other terms, acknowledgees seem to be mentioned together even if they do not belong to the same research area. This may suggest that analytic philosophers ask the advice of informal collaborators not only within their specialized niches but in the broader analytic community as well.

Fig. 10
figure 10

AMN annotated. Clusters’ labels are attributed based on the philosophical specialization of acknowledgees. Note that only 4 clusters out of 6 are labelled

Also, the AMN is less visibly divided into clusters compared to the CCN, in which the intellectual domains are clearly separated. The visual impression is confirmed by the modularity statistics of the two networks: the modularity of AMN equals 0.24, whereas the modularity of CCN equals 0.36. Modularity measures how much a network is divided into sub-communities (Newman, 2018). Networks with high modularity have dense connections between the nodes within clusters but sparse connections between nodes in different clusters. The higher modularity of CCN shows that the intellectual layer of analytic philosophy is more differentiated than its social layer. The differentiation into philosophical sub-disciplines appears then to be more pronounced at the intellectual than at the social level.

Another important difference between the two networks is their temporal focus. The AMN captures the present social structure of analytic philosophy. All the acknowledgees that appear in it are active members of the analytic philosophy community. In this sense, the number of mentions they receive in the corpus reflects their accumulated prestige or symbolic capital in the current analytic community (Petrovich, forthcoming; see also Bourdieu, 1975; and Merton, 1988). The co-citation network, on the other hand, focuses on the past, as it captures the shared knowledge base of the discipline. However, several of the authors with a high number of citations, such as David Lewis, are no more active members of the analytic community. The fact that different philosophers appear in the two networks suggests that the distribution of prestige does not coincide with the distribution of intellectual capital in analytic philosophy. In this regard, there seems to be a temporal gap between the social and intellectual layers of analytic philosophy.

The two types of capital are also differently distributed in the community. The concentration of mentions and citations is shown by the two Lorenz curves in Fig. 11. The Lorenz Curve plots on the \(y\)-axis the cumulative proportion of the mentions (or citations) received by the bottom \(x\)-percent of the population of acknowledgees (or cited references, on the \(x\)-axis), with the \(x=y\) line representing perfect equality.Footnote 16 Figure 11 shows, for instance, that 60% of the acknowledgees receive only 20% of all the mentions available, whereas the same proportion of cited references receive around 30% of all the citations available.

Fig. 11
figure 11

Concentration of mentions and citations. The Lorenz Curve (red line) shows the cumulative proportion of mentions or citations received by the bottom x-percent of the population of acknowledgees or cited references, against the line of perfect equality (black line)

The distribution of mentions deviates from the line of perfect equality more significantly than the distribution of citations, i.e., mentions are less equally distributed than citations. The Gini coefficient can be used as a synthetic index of inequality to compare the two distributions. It is defined as the ratio of the area comprised between the equality line and the Lorenz curve over the total area under the equality line. It ranges between 0 and 1: a value of 0 expresses perfect equality, i.e., every member of the population has the same number of mentions (citations), whereas a value of 1 expresses maximal inequality, i.e., only one member collects all the available mentions (citations). In the case of mentions, the estimated Gini coefficient equals 0.54, whereas, in the case of citations, it equals 0.41. Therefore, symbolic capital, compared to intellectual capital, seems to be more concentrated in a smaller segment of the population.

The last analysis of the relation between intellectual and social structures in analytic philosophy is based on the combination of the structural information extracted from the BCN with the degree statistics of acknowledgees nodes in the PAN, i.e., with the number of mentions each acknowledgee receives in the corpus. Specifically, the research areas obtained from the partition of the BCN are used to break down the number of mentions each acknowledgees receives. In this way, it is possible to investigate the intensity and scope of informal influence that each acknowledgee has on the various areas of analytic philosophy. Table 5 presents the broken-down mention data for the top-15 acknowledgees, i.e., those with more than 40 mentions in the corpus.Footnote 17 Data show that the mentions of top acknowledgees are not equally distributed among research specialties but are concentrated in a few of them, depending on the main philosophical specializations of the acknowledgee. These data allow associating the acknowledgees with the research areas where their influence is higher: David Chalmers, for instance, is mostly associated with philosophy of mind (where he is mentioned in almost 1 article out of 5), metaphysics, and epistemology; Timothy Williamson with epistemology and metaphysics. Interestingly, some acknowledgees seem to extend their informal influence on numerous areas, whereas others are focused on a few sub-disciplines. If we consider research areas instead of acknowledgees, it is interesting to note that none of the top acknowledgees is mentioned in history of philosophy articles, confirming that this sub-discipline is peripheral of analytic philosophy mainstream, a result that emerged also from the comparison of the BCN with the ACN (see Fig. 8 above). Logic and philosophy of language, on the other hand, appear mostly as the second most associated area of top-acknowledgees, suggesting that these areas function as a common glue of analytic philosophy. Lastly, also the special role of epistemology is confirmed, as 7 out of 15 of the most mentioned acknowledgees receive most of their mentions from articles in this area.

Table 5 Mention decomposition for the top-15 acknowledgees

Conclusions

Science mapping has delivered in the last decades powerful tools for mapping the intellectual structure of scientific and scholarly fields, i.e., what has been called the semantic layer of knowledge systems. As to the social layer of science and scholarship, however, science mapping has relied mainly on co-authorship analysis, a methodology that is characterized by some limitations. In this study, we advanced a new method for mapping scientific and scholarly social networks that hinges on the analysis of the acknowledgments of academic publications. The method is based on four networks extracted from the acknowledgments: the Publications \(\times\) Acknowledged Entities network (PEN), the Publications \(\times\) Acknowledgees network (PAN), the Acknowledgment Coupling Network (ACN), and the Acknowledgee Co-Mention Network (AMN). Each of these networks sheds light on different facets of the social structure of a scientific or scholarly field. Analyzed in combination with the semantic networks (e.g., citation-based networks), they allow providing an in-depth picture of the socio-epistemic structures of a field. Moreover, the concept of acknowledgments as positioning signals permits gaining a nuanced understanding of the social structures emerging from acknowledgments analysis, highlighting that they include both informal collaboration ties and prestige dynamics, i.e., what we have called “networks of alliances”.

The framework allows to better specify why the acknowledgments-based network pertains to the social layer of science even if two of them, namely the PAN and the ACN, are not strictly speaking social networks, as they include nodes that are not human actors but artifacts, i.e., the publications. The key point is that all the acknowledgments-based networks reflect communicative acts between the human social actors in the social layer, whose aim is to situate the actors themselves among the other actors in the layer. In this sense, the links in all three networks reflect the social process of constructing a shared system of coordinates in the social layer.

The case study of analytic philosophy, on which we tested the proposed method, showed the type of results that can be obtained and the superiority of acknowledgments-based networks over co-authorship networks for areas where multi-authored publications are rare. At the same time, it highlighted two practical issues of the methodology that prevents an easy scale-up to massive datasets: data collection and data cleaning.

As to the former, the acknowledgments are only partially covered by Web of Science and the other multidisciplinary databases standardly used in scientometrics. In particular, Web of Science started to collect the acknowledgments only recently. For the Social Science Citation Index, these data are reliable from 2015, for the Arts & Humanities Citation Index from later on (Liu et al., 2020). Unfortunately, Web of Science records only the acknowledgments that mention external funding to allow funders to track the research they supported. In the humanities, however, these represent a small minority of the acknowledgments. Therefore, due to the current policies of the database, there is no alternative to the manual collection of acknowledgments from printed or electronic versions of articles, a very time-consuming operation.

The second major issue concerns the cleaning of the data. Once named entities are extracted with automatic processes, the raw output is characterized by several problems of disambiguation and variants merging. In particular, acknowledged entities such as universities and conferences, which are very interesting to reconstruct the institutional part of the social context, appear in many variants that need to be merged. Algorithms based on string similarity may help in this regard, but they always need human supervision to provide high-quality data. Also this task is labor-intensive and limits the scalability of the method.

Apart from technical developments that deal with these data-related issues, the research on acknowledgments-based networks may be extended in several directions. In particular, the understanding of the acknowledging behavior and, hence, of the meaning of PAN links, may be improved by analyzing the characteristics of the authors and the acknowledgees, i.e., by explicitly including the author-nodes in the PAN.Footnote 18 Do junior authors tend to mention senior scholars? Is there gender homophily in the network, i.e., do persons tend to acknowledge persons of the same gender? Do people coming from similar institutions or with the same educational background exchange acknowledgments more frequently? Another research line may focus on the temporal relationship between acknowledgments, co-authorships, and citations: are researchers that mention each other in their acknowledgments more likely to become co-authors in the future? Are mentions in the acknowledgments a predictor of citations or vice versa? Do acknowledgements grant a legitimacy bonus for papers that span across intellectual communities? What is the role of geographical connectivity and researchers’ mobility: do frequently mentioned acknowledgees hold high-ranked positions in eminent institutions where authors of academic papers might have spent some time?

Answering these questions may not only shed further light on the function of the acknowledgments, but also improve our understanding of the social dynamics of research fields, providing insightful material for empirical, data-driven social epistemology. From this point of view, acknowledgments-based networks are in line with recent attempts to integrate network analysis within philosophy of science (Herfeld & Doehne, 2019; Pence & Ramsey, 2018). Hopefully, a better understanding of the relation between the semantic, semiotic, and social layers of science will also contribute to fruitful exchanges between philosophy of science, history of science, sociology of science, and quantitative studies of science.