This paper investigates the data accumulation velocity of 12 Altmetric.com data sources. The DOI created date recorded by Crossref and the altmetric event posted date tracked by Altmetric.com are combined to reflect altmetric data accumulation patterns over time and to compare the data accumulation velocity of various data sources through three proposed indicators: Velocity Index, altmetric half-life, and altmetric time delay. Results show that altmetric data sources exhibit different data accumulation velocities. Some altmetric data sources accumulate data very fast within the first few days after publication, such as Reddit, Twitter, News, Facebook, Google+, and Blogs. At the opposite end of the spectrum, research outputs accrue data at a relatively slow pace on other data sources, such as Policy documents, Peer review, Q&A, Wikipedia, Video, and F1000Prime. The velocity of most altmetric data sources also varies by document type, subject field, and research topic. Reviews are slower in receiving altmetric mentions than Articles, while Editorial Materials and Letters are typically faster. In general, most altmetric data sources show higher velocity values in the fields of Physical Sciences and Engineering and Life and Earth Sciences. Within each field, there are also some research topics that attract social attention faster than others.
“Speed” has been highlighted as one of the most important characteristics of altmetrics (Wouters and Costas 2012; Bornmann 2014). Compared to citations, which have often been criticized for their time delay in providing a reliable measurement of research impact (Wang 2013), speed in the context of altmetrics is related to the idea that the impact of a given scientific output can be measured and analyzed much earlier (Priem et al. 2010; Mohammadi and Thelwall 2014). Publication delays are considered to substantially slow down the formal communication and dissemination of scientific knowledge (Amat 2008; Björk and Solomon 2013). In contrast, scholarly interactions on social media platforms are likely to happen within a very short time frame. For instance, Twitter mentions of scientific documents may occur within hours or even minutes after the documents become available online (Shuai et al. 2012; Haustein et al. 2015a).
However, because of the strong heterogeneity of altmetrics (Haustein 2016), which incorporate a wide range of metrics based on different types of data sources, it is difficult to establish a clear-cut and unified conceptual framework for the temporal analysis of all altmetrics. Each altmetric indicator, typically with unique functions and aimed at different audiences, may tell a different story about the reception of publications and show distinct patterns in varying contexts. Lin and Fenner (2013) concluded that altmetrics very likely represent very different things. From this point of view, we argue that the characteristic properties of different altmetrics, including their “speed”, should be interpreted for each metric separately.
Accumulation patterns and immediacy measurement of citations and usage metrics
In contrast to altmetric data, the accumulation patterns of citations have already been widely discussed in previous studies from several perspectives, such as their “obsolescence” (Line 1993), “ageing” (Aversa 1985; Glänzel and Schoepflin 1995), “durability” (Costas et al. 2010), or “delayed recognition” (Garfield 1980; Min et al. 2016). Citation histories, which relate to the analysis of the distribution of citations over time, were mainly studied from the synchronous or diachronous perspectives (Stinson and Lancaster 1987). The former considers the distribution of the publication years of cited references, while the latter focuses on the distribution of received citations over time (Colavizza and Franceschet 2016; Sun et al. 2016); the two are also referred to as “retrospective citations” and “prospective citations”, respectively (Glänzel 2004). These two approaches have been applied to studying the accumulation patterns of usage metric data as well. With the development of digital publishing, usage metrics have been proposed and adopted by publishers during the last decades to supplement citations in reflecting how frequently scientific outputs are used and measuring their early impact to some extent (Schloegl and Gorraiz 2011). From the synchronous perspective, Kurtz et al. (2005) concluded that most studies of obsolescence found that the use of literature declines exponentially with age. The diachronous accumulation patterns of usage metrics, such as views, downloads, and reads, were investigated and often compared with citations. On the basis of page view data of Nature publications, Wang et al. (2014) explored the dynamic usage history over time and found that papers are used most frequently within a short period after publication: the median time for papers to reach half of their total page views is only 7 days. Schlögl et al. (2014) reported that citations take several years to reach their peak, whereas most downloads of papers are accrued quickly, within the publication year itself. In a similar fashion, Moed (2005) had already found that citations and downloads show different patterns of obsolescence, and that about 40% of downloads accumulated within the first 6 months after publication. More recently, Wang et al. (2016a) used the article-level “usage counts” provided by Web of Science to investigate the usage patterns of indexed papers and found that newly published papers accumulated more Web of Science usage counts than older papers.
As to the measurement of the “speed” of citations and usage metrics, several indicators have been created and applied in practice. For example, based on the time elapsed between the publication date and the date of the first citation of a paper, Schubert and Glänzel (1986) developed the indicator mean response time (MRT) to measure the citation speed of journals, understood as the properly formed average number of years between the publication of articles in a journal and the time of their first citation. In order to measure how quickly articles in a journal are cited, the Journal Citation Reports (JCR) calculates an indicator named Immediacy Index for each journal in each year. This indicator is defined as the average number of times an article is cited in the same year it is published (Footnote 1). In addition, at the journal level, Cited Half-Life and Citing Half-Life are calculated by JCR to measure how fast journals accumulate half of their citations and how far back that citing relationship extends (Footnote 2). Analogous to the citation-based Immediacy Index and half-life, the “usage immediacy index” and “usage half-life” (Rowlands and Nicholas 2007) and the “download immediacy index” (Wan et al. 2010) were proposed to describe the life cycle of usage metrics. By analyzing usage data in the field of oncology collected from ScienceDirect, Schloegl and Gorraiz (2010) calculated the mean usage half-life and found that it is much shorter than the average cited half-life, also observing different obsolescence patterns between downloads and citations.
Accumulation patterns and immediacy measurement of altmetric data
Since the emergence of altmetrics, most related studies have focused on the coverage of publications across altmetric sources and their correlation with citation counts (Thelwall et al. 2013; Haustein et al. 2014; Costas et al. 2015a). Less attention has been paid to the accumulation velocity of altmetric data over time, and only a few altmetric data sources have been investigated from the perspective of their immediacy. Maflahi and Thelwall (2018) conducted a longitudinal weekly study of the Mendeley readers of articles in six library and information science journals and found that readers start to accrue as soon as articles are first available online and continue to build steadily over time, this being the case even for journals with large publication delays. Thelwall (2017) also found that articles attracted between 0.1 and 0.8 Mendeley readers on average in the month they first appeared in Scopus, with some variability across subject fields. The results of Wang et al. (2016b), based on PeerJ social referrals data, suggested that the number of “visits” to papers from social media (Twitter and Facebook) accumulates very quickly after publication. By comparing the temporal patterns of Twitter mentions and downloads of arXiv papers, Shuai et al. (2012) found that Twitter mentions have shorter delays and narrower time spans than arXiv downloads. Ortega (2018) compared the monthly temporal distributions of citations, views, downloads, Mendeley readership, tweets, and blog mentions recorded by PlumX, and concluded that tweets and blog mentions are the metrics that become available most quickly. Yu et al. (2017) found that Twitter and Weibo are more immediate than citations, but they also suggested that not all altmetric data sources have the same degree of immediacy.
In contrast to citation histories, which are mainly analyzed at the year or month level, such large time aggregations are insufficient for altmetrics, since the real-time updating of social media metric data makes altmetric events around research outputs visible at much smaller time scales (e.g. hours or days). Nevertheless, a large-scale quantitative analysis comparing the data accumulation patterns of different altmetric data sources at a micro-level time interval (i.e. the day) is still missing in the altmetrics literature, probably because of the absence of a reliable and precise proxy for publication dates, a piece of information that is critical for studying the accumulation patterns of altmetric data (Haustein et al. 2015a). Crossref provides several publication dates for its recorded DOIs, such as the DOI created date (date on which the DOI was first registered), the published-online date (date on which the work was published online), and the published-print date (date on which the work was published in print). The distribution and potential of these dates for altmetrics were compared and analyzed in a previous study (Fang and Costas 2018), which, as suggested by Haustein et al. (2015a), highlighted the value of the DOI created date as a fine-grained benchmark of publication date in the context of altmetrics.
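To make the distinction between these Crossref dates concrete, the following minimal sketch reads them from a record shaped like a Crossref work. The sample record and its values are invented, and the field names (`created`, `published-online`, `published-print`, `date-parts`) reflect our understanding of the Crossref metadata format; they should be checked against the current Crossref documentation before use.

```python
from datetime import date

# Hypothetical record, shaped like a Crossref work; all values are invented.
sample_work = {
    "DOI": "10.1234/example",
    "created": {"date-parts": [[2015, 3, 17]]},
    "published-online": {"date-parts": [[2015, 3, 20]]},
    "published-print": {"date-parts": [[2015, 5]]},  # day missing
}

def parts_to_date(field):
    """Convert a date-parts entry to a date, or None if it lacks day precision."""
    if not field:
        return None
    parts = field["date-parts"][0]
    if len(parts) < 3:  # year-only or year-month values cannot anchor a day-level analysis
        return None
    return date(*parts)

created = parts_to_date(sample_work.get("created"))
online = parts_to_date(sample_work.get("published-online"))
printed = parts_to_date(sample_work.get("published-print"))
```

In this invented record only the created and published-online dates carry day precision, which illustrates why a date that is always complete to the day is attractive as a publication-date proxy.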
In this paper, on the basis of the DOI created date recorded by Crossref, as well as the altmetric event posted date (Footnote 3) recorded by Altmetric.com, we compare the accumulation velocity amongst different types of altmetric data from a diachronous perspective.
The main objectives of this study are:
to measure the accumulation velocity of altmetric data of scientific publications on 12 Altmetric.com data sources, here velocity referring to the pace at which altmetric events accumulate over time, and
to compare altmetric data accumulation velocity of different altmetric data sources across document types, subject fields, and research topics.
The specific research questions are as follows:
What are the altmetric data accumulation patterns of the various Altmetric.com data sources?
On which data sources do newly published research outputs show higher velocity in accruing altmetric data (and which ones are relatively lower)?
How does the data accumulation velocity of different Altmetric.com data sources vary across document types, subject fields, and research topics?
Data and methods
Altmetric.com data sources with altmetric event posted date
In this study altmetric event records of 12 Altmetric.com data sources with posted date are selected as research objects. The altmetric data for this study were provided by Altmetric.com in a dump file with their data until October 2017. Table 1 presents these 12 data sources with event posted date information tracked by Altmetric.com together with the date when they started their coverage.
The posted date of an altmetric event gives the exact date on which that event was posted. In addition, in order to study the accumulation patterns of altmetric data at the day time interval, DOI created dates of research outputs recorded by Crossref are collected to serve as the proxy for publication dates. To obtain both the altmetric event posted date and the DOI created date for measuring accumulation velocity, Web of Science (WoS) publications meeting the following criteria were selected as research objects:
Publications with DOI recorded by Crossref. In order to get the DOI created dates, selected publications must have DOIs recorded by Crossref.
Publications with publication date ranging from 2012 to 2016 according to both WoS publication year and Crossref DOI created date. To filter out old publications with newly registered DOIs (Fang and Costas 2018), WoS publication year is also used as a benchmark to restrict the publication year of samples.
Publications with at least one altmetric event recorded from any altmetric data source listed in Table 1.
Publications without an arXiv preprint version tracked by Altmetric.com. The existence of a preprint version makes research outputs available to social media before they are formally published (Darling et al. 2013), which may cause altmetric event posted dates to precede the publication date. Therefore, publications with arXiv IDs tracked by Altmetric.com are not included in this study.
According to the above criteria, there are 2,597,339 publications extracted from the CWTS in-house WoS database. However, 204,387 of them (accounting for 7.9%) have at least one altmetric event posted date earlier than their DOI created dates. Except for the influence of preprint versions, in theory an altmetric event cannot mention a DOI before it exists. The possible reasons for the existence of these unreliable cases are the following:
Crossref DOI created dates may contain errors and do not always accurately reflect the publication date.
Publications’ DOI created dates may be updated by publishers for different reasons (e.g. publisher mergers) (Footnote 4).
In order to ensure the highest precision in our analysis, publications with any altmetric event posted date before their DOI created date are excluded, resulting in a total set of 2,392,952 publications that are finally analyzed in this study. Table 2 lists the number of publications mentioned by each data source and the total number of altmetric events they have accumulated in the dataset. Twitter contributes the vast majority of altmetric data for the selected publications, followed by Facebook.
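The exclusion rule described above can be sketched as follows. This is an illustrative sketch rather than the procedure used to build the dataset: the records, DOIs, and the helper name `is_reliable` are hypothetical.

```python
from datetime import date

# Hypothetical records: (DOI, DOI created date, altmetric event posted dates).
publications = [
    ("10.1234/a", date(2014, 1, 10), [date(2014, 1, 10), date(2014, 2, 1)]),
    ("10.1234/b", date(2015, 6, 1), [date(2015, 5, 20), date(2015, 6, 3)]),  # one event precedes the DOI
    ("10.1234/c", date(2016, 3, 5), [date(2016, 3, 9)]),
]

def is_reliable(created, posted_dates):
    """Keep a publication only if no event was posted before its DOI created date."""
    return all(posted >= created for posted in posted_dates)

kept = [doi for doi, created, posted in publications if is_reliable(created, posted)]
excluded_share = 1 - len(kept) / len(publications)
```

Applied to the figures reported in the text, the same rule removes 204,387 of 2,597,339 publications (about 7.9%) and leaves 2,392,952.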
Indicators and analytical approaches
Considering the diverse nature, scale, and user types of different altmetric data sources, it is very likely that they also accumulate data on newly published research outputs at very different velocities. To reflect the velocity differences among altmetric data sources, we use three indicators that measure velocity from both flexible and fixed perspectives: Velocity Index, altmetric half-life, and altmetric time delay.
For altmetric data accumulated on a specific data source, the Velocity Index (VI) refers to the proportion of altmetric events that happened in a specific time interval (e.g. 1 day, 1 month, 1 year, etc.) after the publication of the papers. It is calculated as follows:

$$\mathrm{VI} = \frac{P_i}{TP_i}$$

where $P_i$ is the number of events accrued in a specific time interval after publication (e.g. 1 day, 1 month, 1 year, etc.) for a set of publications, and $TP_i$ is the total number of events during the observed time window. In general, the closer the Velocity Index is to 1, the more immediately (faster) the altmetric data of new publications accumulated in the given observation period. Conversely, the closer it is to 0, the lower the accumulation velocity (i.e. more events happened beyond the specified period of time).
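As a minimal sketch of this definition, the Velocity Index can be computed from the day intervals between publication and each event. The function name, the sample intervals, the inclusive boundary (`<=`), and the approximation of a month as 30 days are all assumptions made for illustration.

```python
def velocity_index(intervals_in_days, window_days):
    """Share of events posted within `window_days` of publication (Pi / TPi)."""
    if not intervals_in_days:
        raise ValueError("no events observed")
    within = sum(1 for d in intervals_in_days if d <= window_days)
    return within / len(intervals_in_days)

# Hypothetical day intervals between publication and each event on one data source.
intervals = [0, 1, 3, 12, 29, 45, 200, 400]

vi_day = velocity_index(intervals, 1)     # 2 of 8 events
vi_month = velocity_index(intervals, 30)  # 5 of 8 events; month taken as 30 days (assumption)
vi_year = velocity_index(intervals, 365)  # 7 of 8 events
```

Because the same events are re-counted under different windows, the index is non-decreasing as the window widens, which is what allows the day, month, and year values to be compared for one source.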
Besides, in line with the Twitter half-life and Twitter time delay proposed by Haustein (2019), which refer to the number of days until 50% of all tweets have appeared and the number of days between publication of a document and its first tweet, respectively, we generalize these indicators for all altmetric data sources. Consequently, the altmetric half-life of an altmetric data source is defined as the number of days until half of its events have appeared, and altmetric time delay of a research output on an altmetric data source is defined as the number of days between its publication and its first altmetric event on that data source.
Both Velocity Index and altmetric half-life are based on overall data distribution of all events received by a publication, while altmetric time delay focuses on a special altmetric event (the first one). Velocity Index provides a flexible perspective for the measurement of data accumulation velocity, since it allows for more nuanced time accumulation discussions considering different time intervals (i.e. days, months, years). By comparison, altmetric half-life and altmetric time delay provide a fixed perspective at the day level. Therefore, these indicators work as relevant complements to each other in order to better characterize the tempo of altmetric data accumulation.
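The two day-level indicators can be sketched in the same way. The function names and sample values are hypothetical, and reading "half of its events" as the smallest day count by which at least 50% of events have appeared is our assumption.

```python
import math

def altmetric_half_life(intervals_in_days):
    """Smallest number of days by which at least half of all events have appeared."""
    ordered = sorted(intervals_in_days)
    if not ordered:
        raise ValueError("no events observed")
    needed = math.ceil(len(ordered) / 2)  # number of events required to reach 50%
    return ordered[needed - 1]

def altmetric_time_delay(intervals_in_days):
    """Days between publication and the first event on a data source."""
    return min(intervals_in_days)

source_intervals = [0, 1, 3, 12, 29, 45, 200, 400]  # all events on one source (hypothetical)
paper_intervals = [3, 12, 200]                      # events of one publication (hypothetical)

half_life = altmetric_half_life(source_intervals)
delay = altmetric_time_delay(paper_intervals)
```

Note the different units of analysis: the half-life is computed over all events of a data source, whereas the time delay is computed per publication and then summarized as a distribution.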
In addition, Spearman correlation analysis is performed with IBM SPSS Statistics 25 to explore the relationships among Velocity Index, altmetric half-life, and altmetric time delay. Also, at the research topic level, in order to test whether research topics with fewer publications and altmetric events are more likely to reach higher values of Velocity Index, Spearman correlation analysis is applied to examine the relationships among the number of publications, the number of altmetric events, and the Velocity Index.
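For readers without access to SPSS, the rank correlation itself can be illustrated in plain Python. This sketch is not the SPSS procedure used in the study; it computes Spearman's rho as the Pearson correlation of average ranks, and the indicator values are invented.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed values."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical indicator values for five data sources.
vi_month = [0.80, 0.65, 0.40, 0.10, 0.05]
half_life = [7, 13, 60, 500, 900]
rho = spearman(vi_month, half_life)  # perfectly inverse rankings, so rho is close to -1
```

A strongly negative rho between Velocity Index and half-life is the expected direction, since a faster source has a higher index and a shorter half-life.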
CWTS publication-level classification system
The CWTS classification is a publication-level subject field classification system developed by Waltman and Van Eck (2012). It has not only been applied in the Leiden Ranking (https://www.leidenranking.com/), but has also been employed by many previous studies for subject field related analysis (Costas et al. 2015a; Didegah and Thelwall 2018). In the 2019 version of the publication-level classification, only citable items (Article, Review, and Letter) indexed by Web of Science are clustered into 4535 micro-level fields. These micro-fields correspond to small research topics (micro-topics), and they are algorithmically assigned to five main subject fields of science: Social Sciences and Humanities (SSH), Biomedical and Health Sciences (BHS), Physical Sciences and Engineering (PSE), Life and Earth Sciences (LES), and Mathematics and Computer Science (MCS) (Footnote 5), which are illustrated in Fig. 1 with VOSviewer. The layout of Fig. 1 is also used to exhibit the Velocity Index of each micro-topic in the Results section. Of the selected publications in our dataset, 2,189,708 (accounting for 91.5%) have CWTS classification information. This set of publications is used as our final sample for the comparison of altmetric data accumulation velocity across subject fields and research topics. Statistics on the general presence of different altmetric data across the five main subject fields can be found in Appendix Table 4.
Altmetric data accumulation patterns
The intervals between publication dates and altmetric event posted dates are calculated for all altmetric events on each data source, which allows us to investigate the altmetric data accumulation patterns at the day time interval. Figure 2 shows the data accumulation patterns of the 12 data sources within a 1-year time interval (365 days) after publication. The patterns differ markedly across data sources. Altmetric events around newly published research outputs accumulated very fast on some data sources, such as Reddit and Twitter: half of their data accrued in the first 2 weeks (14 days) after the research outputs were published, and over 85% of their data accrued within a year (365 days). Twitter and Reddit are followed by other relatively fast altmetric data sources, including News, Google+, Facebook, and Blogs. In contrast, Policy documents, Wikipedia, Q&A, and Peer review show much slower data accumulation patterns. Only 21.5% of Policy document citations, 31.9% of Peer review comments, 39.4% of Wikipedia citations, and 40.6% of Q&A mentions are accumulated within 1 year, which means that most of the events from these data sources happened more than a year after publication. Among the data sources, F1000Prime is a special case. In the first month after research outputs are published, the accumulation of F1000Prime recommendations is not very fast, but it speeds up over time, with more than 84% of data accrued within the first year.
The dashed line at the cumulative percentage of 50% in Fig. 2 indicates the altmetric half-life, and Table 3 lists the altmetric half-lives of the 12 data sources analyzed. Reddit ranks first, with a half-life of 7 days, followed by Twitter (13 days), News (22 days), Google+ (25 days), and Facebook (30 days). Over half of the altmetric events on these data sources happened within 1 month after the publication of research outputs. Other sources, such as Wikipedia, Peer review, and Policy documents, need over 500 days to accumulate half of their event data. On the one hand, these data sources react more slowly to newly published publications. On the other hand, this suggests that they also pay more attention to older publications.
Generalizing the Velocity Index and altmetric time delay
The Velocity Indexes of each Altmetric.com data source at the day, month, and year time intervals are calculated, and the rankings of sources by their Velocity Index are shown in Fig. 3. The rankings vary across time intervals. Reddit, Twitter, and News are the data sources showing the most immediate data accumulation patterns at the day, month, and year time intervals, followed by Facebook, Google+, and Blogs, whereas Policy documents, Peer review, Wikipedia, Q&A, and Video have lower Velocity Index values. F1000Prime, as mentioned above, is one of the slowest data sources at the day time interval but ranks third at the year time interval. This means that the accumulation of F1000Prime recommendations of newly published research outputs is relatively slow in the short term but faster at the year time interval (see also Fig. 2). The case of F1000Prime highlights the importance of considering the altmetric half-life of data sources together with their Velocity Index, since the two offer different perspectives on the tempo of altmetric data.
Besides the Velocity Index and altmetric half-life, which are based on the overall altmetric data of each data source, we also consider the time delay until publications accrued their first altmetric event on different data sources, in which case only one specific altmetric event per publication is considered. The number of days between being published and being mentioned for the first time on a certain data source is calculated for each publication, and the distribution of altmetric time delays of the 12 Altmetric.com data sources is plotted in Fig. 4. Each curve shows, for a specific data source, the proportion of publications that accrued their first altmetric event beyond a certain number of days after being published. For instance, only about 37% of publications received their first Twitter mention after the 10th day after publication (the vertical dashed line in Fig. 4), while 94% of publications received their first Wikipedia citation after the 10th day. In other words, around 63% of publications obtained their first Twitter mention within 10 days after publication, whereas only 6% of publications received a Wikipedia citation within the same time period. The more skewed the curve, the higher the proportion of publications that accrued their first altmetric event after a long time. As a result, publications become visible faster on Twitter than on other data sources, followed by Reddit, Google+, and Facebook. Across altmetric data sources, the patterns of accumulating the first altmetric event are quite similar to their Velocity Indexes at the month time interval and their altmetric half-lives (Appendix Table 5 provides the Spearman correlations for the rankings based on these three indicators).
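The kind of curve described for Fig. 4 can be sketched as the share of publications whose first event falls beyond a given day. The time delays below are invented and the function name is hypothetical.

```python
def share_beyond(time_delays, day):
    """Proportion of publications whose first event came after `day` days."""
    return sum(1 for d in time_delays if d > day) / len(time_delays)

# Hypothetical time delays (days to first mention) for ten publications on one source.
delays = [0, 0, 1, 2, 4, 7, 9, 15, 40, 300]

# Points of the curve at a few selected days after publication.
curve = [(day, share_beyond(delays, day)) for day in (0, 10, 100, 365)]

# Complement at day 10: share of publications first mentioned within 10 days.
within_10 = 1 - share_beyond(delays, 10)
```

Reading the curve at day 10 and taking the complement mirrors the interpretation in the text, where 37% of publications beyond day 10 on Twitter corresponds to 63% within 10 days.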
Overall, Twitter, Reddit, Google+, Facebook, News, and Blogs can be categorized as fast sources, while in general, F1000Prime, Video, Wikipedia, Q&A, Peer review, and Policy documents show lower velocity in mentioning research publications. These six data sources can be classified as slow sources.
Velocity Index variations across document types
Altmetric data accumulation velocity may differ across document types. We therefore use the Velocity Index at the month time interval to measure the altmetric data accumulation velocity of different document types across data sources. The differences in the Velocity Index across the four main document types with the largest numbers of publications, namely Article (N = 1,951,197, Coverage = 81.5%), Review (N = 196,722, Coverage = 8.2%), Editorial Material (N = 139,950, Coverage = 5.8%), and Letter (N = 52,038, Coverage = 2.2%), are illustrated in Fig. 5. The presence of altmetric data across these four document types is listed in Appendix Table 6. Article is by far the largest document type by number of publications, so its Velocity Index is very close to the overall Velocity Index of each data source. Review, Editorial Material, and Letter, in comparison, differ from the overall Velocity Index, especially for data sources with relatively high Velocity Index values. Reviews are not as fast in accumulating altmetric data as the other document types. Conversely, Editorial Material and Letter are document types more likely to be mentioned soon after publication. The Velocity Indexes of these two document types are higher than the overall Velocity Index for most data sources. In particular, Editorial Material and Letter hold relatively high Velocity Indexes on Peer review platforms (Publons and PubPeer), which are among the group of “slower” data sources based on the overall Velocity Index (Fig. 3) and altmetric half-life (Table 3). The Review type also has a slightly higher Velocity Index on Peer review events than both the overall value and the Article type. These results suggest that Peer review platforms notice and comment on Editorial Materials, Letters, and Reviews more quickly than on regular Articles.
Although the coverage of these three document types with Peer review data is limited (0.20–0.27%), there are larger shares of Peer review comments that happened soon after their publication compared to other altmetric events of slow sources.
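A comparison of this kind amounts to computing the month-level Velocity Index separately for each document type and setting it against the overall value. The sketch below uses invented events and assumes a 30-day month; it is not the computation behind Fig. 5.

```python
from collections import defaultdict

# Hypothetical events on one data source: (document type, days from publication to event).
events = [
    ("Article", 2), ("Article", 40), ("Article", 400), ("Article", 10),
    ("Review", 90), ("Review", 15), ("Review", 700),
    ("Letter", 1), ("Letter", 5),
]

MONTH_DAYS = 30  # assumption: the month interval is approximated as 30 days

totals = defaultdict(int)        # all events per document type
within_month = defaultdict(int)  # events within the first month per document type
for doc_type, interval in events:
    totals[doc_type] += 1
    if interval <= MONTH_DAYS:
        within_month[doc_type] += 1

vi_by_type = {t: within_month[t] / totals[t] for t in totals}
overall_vi = sum(within_month.values()) / len(events)
```

Because the overall index is an event-weighted average of the per-type values, the type contributing the most events pulls the overall value towards its own, as observed for Article.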
Velocity Index variations across subject fields
The coverage of publications in Altmetric.com from different data sources differs by subject field (Zahedi et al. 2014; Costas et al. 2015b). In this study (Fig. 6) we analyze the changes in the Velocity Index at the month time interval of different Altmetric.com data sources across the five major subject fields of science (using the CWTS classification). Each row presents the Velocity Indexes of the altmetric data sources ranked from high to low in one subject field. Each altmetric data source in Fig. 6 is shown in the same color throughout, together with its specific Velocity Index. At the top of Fig. 6, altmetric data sources are ranked by their overall Velocity Indexes at the month time interval. Colored lines connecting Velocity Indexes of the same color display the rank changes of the same data source across subject fields. According to these results, Twitter and Reddit are the data sources that respond most immediately to newly published research outputs in all subject fields. By subject field, the overall Velocity Indexes of all altmetric sources are highest in Physical Sciences and Engineering and in Life and Earth Sciences. Facebook shows a higher degree of immediacy in the fields of Social Sciences and Humanities and Mathematics and Computer Science, although overall the Velocity Index values of these subject fields are comparatively low. Conversely, News has a relatively high Velocity Index in the fields of Physical Sciences and Engineering, Life and Earth Sciences, and Biomedical and Health Sciences, while it is slower in Social Sciences and Humanities. The other data sources keep fairly steady medium or low Velocity Indexes in all subject fields. For example, Policy documents, Peer review, and Q&A have the lowest Velocity Indexes across most subject fields, suggesting that these data sources are less focused on recent publications than the other sources, regardless of the subject fields of the publications.
From the perspective of altmetric time delay, Fig. 7 shows the distribution of altmetric time delay across the five main subject fields for each of the 12 Altmetric.com data sources. For most data sources, although to different degrees, publications in the fields of Physical Sciences and Engineering (PSE) and Life and Earth Sciences (LES) are faster to receive their first altmetric mention. In contrast, it took more days for publications in the fields of Social Sciences and Humanities (SSH) and Mathematics and Computer Science (MCS) to accumulate their first altmetric event record. Altmetric time delays of publications in Biomedical and Health Sciences (BHS) are in the middle on most data sources. Overall, the accumulation velocity across subject fields in terms of altmetric time delay is similar to the results observed through the lens of the Velocity Index.
Velocity Index variations across research topics
Considering the Velocity Index at the month time interval, we further investigate the variations across research topics to study which topics accumulated altmetric data faster than others. Twitter and Wikipedia are selected as representatives of fast and slow sources, respectively, because they hold the largest data volume within their group of data sources. Velocity Indexes are calculated for the publications within each micro-level field, which share similar research micro-topics, based on Twitter mention data (Fig. 8) and Wikipedia citation data (Fig. 9). In both Figs. 8 and 9, the size of each circle is determined by the number of publications with Twitter mention/Wikipedia citation data in the micro-level field, while the color is determined by the Velocity Index at the month time interval. Across micro-level fields, the number of publications and the number of altmetric events are very weakly correlated with the Velocity Index values based on Twitter data, and moderately and positively correlated with those based on Wikipedia data (Appendix Table 7), indicating that micro-level fields with fewer publications are not necessarily more likely to reach a high Velocity Index, and vice versa. Some prominent research micro-topics with relatively high Velocity Index values in each main subject field are highlighted with annotation texts.
From the point of view of Twitter data, the research micro-topics in the field of Physical Sciences and Engineering exhibit the highest Velocity Index values compared with the other fields, which corresponds with the above observations. Within the other subject fields, there are some research micro-topics that show quite high Twitter mention accumulation velocity as well. For example, “wireless power transfer” and “compressive sensing” in Mathematics and Computer Science accumulated the majority of their Twitter mentions in a short time, as did “dinosauria” and “internal tide” in Life and Earth Sciences. In the fields of Biomedical and Health Sciences and Social Sciences and Humanities, “DNA vaccine”, “spiking neuron”, “response inhibition”, and “rock art” drew attention on Twitter relatively fast too.
Compared to Twitter mentions, the overall accumulation velocity of Wikipedia citations is much lower, and the difference among main subject fields is not as obvious as for Twitter. However, there also exist some research micro-topics showing higher data accumulation velocity. For instance, “dinosauria” and “trilobita” in Life and Earth Sciences are two micro-topics that are faster on Wikipedia. Publications about these two topics received more Wikipedia citations in a short time period compared to the others. Similarly, “ecstasy” (caused by drugs), “muscle synergy”, “Waring-Goldbach problem” and some other research micro-topics also accumulate Wikipedia citations relatively fast. In the field of Social Sciences and Humanities, although most research micro-topics were quite slow to be cited by Wikipedia, some micro-topics related to environmental protection, such as “ecocriticism” and “resource curse”, show higher Velocity Index values.
Speed has always been assumed to be a characteristic property of altmetrics; however, not much research has been done on characterizing the accumulation velocity of different altmetric data at a large scale. This study fills this gap by describing the immediacy of altmetric data accrued after the publication of research outputs. Using the DOI created date and the altmetric event posted date makes it possible to study altmetric data accumulation patterns at the day level. The date on which a DOI was assigned to a publication, as provided by Crossref, has already been used by Ortega (2018) to show the life cycle of some altmetric events at the month level. This study further investigates the accumulation velocity of various altmetric data at a finer time interval and with a larger data sample.
As observed by Sun et al. (2016), citation histories typically show a pattern of just a few citations accrued within the first few years after publication, a citation peak after 3–4 years, and a decrease afterwards. Yet most kinds of altmetric data exhibit a different accumulation pattern compared with citations. We found that the accumulation velocity of different altmetric data varies substantially across data sources, document types, and subject fields.
Variations across altmetric data sources
It is demonstrated that altmetric data sources vary in their data accumulation patterns, and that not all altmetric data sources possess the property of speed. Some altmetric data sources accrue a considerable proportion of events very soon after the publication date of scientific outputs. Among these sources are Reddit, Twitter, News, Facebook, Google+, and Blogs. All of them exhibit short altmetric half-lives, short altmetric time delays, and relatively high Velocity Indexes. Therefore, it can be argued that their velocity aligns with the property of speed that altmetrics are expected to have, so they can be labelled fast sources. However, for Policy documents, Q&A, Peer review, Wikipedia, Video, and F1000Prime, only a very limited share of altmetric events happened within a short time after publication, making these slow sources. The data accumulation velocity of some slow sources is similar to that of citations, with substantial delays after publication. For example, based on our dataset, half of Policy document citations happened more than 716 days after publication. Older publications, however, also seem to remain attractive for these slow data sources, so that their attention is not concentrated only on newly published research outputs. As a whole, social media platforms and mainstream media are more immediate in sharing, discussing, and reporting new research outputs.
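The notion of altmetric half-life used in this comparison can be illustrated with a short sketch. It assumes, as one plausible operationalization rather than the exact definition used in this study, that the half-life is the smallest number of days after publication by which at least half of a source's events have been posted. The function name and the sample delays are hypothetical.

```python
import math


def altmetric_half_life(delays_in_days):
    """Smallest delay (in days) by which at least half of the events occurred.

    `delays_in_days` holds, for every event of one data source, the number of
    days between the DOI created date and the event posted date.
    """
    if not delays_in_days:
        raise ValueError("at least one event is required")
    ordered = sorted(delays_in_days)
    # Position of the event that completes the first half of all events.
    k = math.ceil(len(ordered) / 2)
    return ordered[k - 1]


# Hypothetical delays: a fast source and a slow source.
fast_source = [0, 1, 2, 700, 900]
slow_source = [30, 716, 1000, 2000]
fast_half_life = altmetric_half_life(fast_source)
slow_half_life = altmetric_half_life(slow_source)
```

Under this assumption a fast source reaches its half-life within days even if a few events arrive years later, because late events only affect the second half of the ordered delays.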
Interestingly, different time windows may also show different sources as being fast or slow. For example, although F1000Prime is seen as a slow source in the short term (e.g. day or month level), it is one of the sources that accumulated the largest share of its events within 1 year. This reinforces the importance of combining different perspectives (e.g. different indicators, different time windows) to study the tempo of altmetrics to provide the most complete picture.
As a result, assumptions about the “speed” of the types of events classified under the umbrella term “altmetrics” should be taken with particular caution. Not all of them are fast sources, and not all of them have the same accumulation pace. Thus, it is important to take into consideration the social media landscape in which these events are produced (Alperin 2015). Once again, caution must be observed when merging altmetric sources into compound metrics or global indicators, particularly considering that time affects different sources differently. Keeping altmetric events separate seems to be an important recommendation, given not only their fundamental differences (Haustein et al. 2016; Wouters et al. 2019) but also their time accumulation patterns as demonstrated in this study. Moreover, the pace and tempo of different altmetrics cannot be seen as equivalent and, similar to what happens with citations, these time differences need to be taken into account when considering different time windows in altmetric research.
Variations across document types
Zahedi et al. (2014) concluded that the coverage of several altmetric data sources varies across document types and subject fields. In this study, it is shown that the same type of variation also applies to the data accumulation velocity of different altmetric data sources. In terms of document types, Reviews (a document type that mainly focuses on retrospectively reviewing existing findings) are overall the slowest in accumulating altmetric events. A possible reason for this slow reception lies in the less innovative nature of Reviews. In other words, Review papers are less prone to provide new research discoveries and more prone to condense the state of the art in a subject field or research topic, therefore lacking the novelty component of other document types. For example, the research topics presented in Editorial Materials and Letters may be more likely to evoke social buzz immediately, since they cover more novel topics, debates, scientific news, etc., without using overly complicated and technical language (Haustein et al. 2015). The thematic property of these two document types might help them receive users’ attention more immediately, particularly on Peer review platforms, a type of altmetric data source which is mainly used by researchers, who are faster to take notice of controversial topics emerging in the scientific community. This finding is quite similar to the ageing patterns of citations to different document types: Editorial Materials and Letters were found more likely to be “early rise-rapid decline” papers with most citations accumulated in a relatively short time period, while Review was observed to be the delayed document type with slower growth (Costas et al. 2010; Wang 2013).
Variations across scientific fields and topics
In terms of scientific fields, research outputs from the fields of Physical Sciences and Engineering and Life and Earth Sciences are more attractive to social media audiences shortly after publication, accruing altmetric events faster compared to other fields. Research outputs from the fields of both Social Sciences and Humanities and Mathematics and Computer Science are relatively slower to be disseminated on altmetric data sources, although publications in these two fields differ in altmetric data coverage, with the former much higher than the latter (Costas et al. 2015a). Such field-related data accumulation dynamics were also observed in the context of citations. For instance, citation ageing in social sciences and mathematics journals is similarly slower than in medical and chemistry journals (Glänzel and Schoepflin 1995), and the physical, chemical, and earth sciences, fields in which the research fronts are fast-moving, have more papers showing a rapidly declining citation pattern (Aksnes 2003). From the perspective of first-citation speed, papers in the field of physics are faster in receiving the first citation, followed by biological, biomedical, and chemical research, while mathematics papers show lower first-citation speed (Abramo et al. 2011). Even though the overall accumulation patterns of citation data and most altmetric data are obviously different, they share very similar tempos across scientific fields.
Furthermore, the variations exist not only at the main subject field level, but also at the research topic level. Within each subject field, different research topics also show various velocity patterns in receiving altmetric attention, on both fast sources and slow sources. This signifies the thematic dependency of users in following up-to-date research outputs around some topics, just as certain research topics attract more social attention than others (Robinson-Garcia et al. 2019). Thus, further research should focus on identifying the main distinctive patterns of publications and research topics to determine their faster/slower reception across altmetric sources, and how different observation time windows, and the selection of different data sources, may affect real-time assessment in altmetric practice.
The main limitation of this study lies in the precision of Crossref’s DOI created date as the proxy of the actual publication date of research outputs. There might still be a small gap between the date on which a DOI was created and the date on which the research output was actually made publicly available, which could result in some inaccuracies in our results. Besides, as we mentioned in the data part, DOI created dates might be updated due to a change of DOI status, thereby causing unreliable time intervals. One of the effects of these inaccuracies is that some publications may have altmetric event posted dates even earlier than their DOI created dates. Therefore, publications with such unexpected time intervals have been excluded from this study to lower the negative influence of questionable DOI created dates. Future research should focus on refining accurate methods of identifying the effective publication date of research outputs. As shown in this study, such dates have important repercussions for determining accurate time windows for altmetric research.
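The exclusion step described above can be sketched as follows. This is only an illustration under stated assumptions, not the actual data-cleaning code of this study: it assumes that a publication is dropped as a whole when any of its events was posted before its DOI created date. The function name, the dictionary layout, and the example DOIs are hypothetical.

```python
from datetime import date


def drop_publications_with_negative_intervals(publications):
    """Keep only publications whose events were all posted on or after DOI creation.

    `publications` maps a DOI to a (doi_created, [event_posted, ...]) pair.
    """
    return {
        doi: (doi_created, posted_dates)
        for doi, (doi_created, posted_dates) in publications.items()
        if all(posted >= doi_created for posted in posted_dates)
    }


# Hypothetical records: the second publication has an event that predates
# its DOI created date and is therefore removed.
records = {
    "10.0000/example.a": (date(2017, 5, 1), [date(2017, 5, 1), date(2017, 6, 10)]),
    "10.0000/example.b": (date(2017, 5, 1), [date(2017, 4, 20), date(2017, 5, 3)]),
}
cleaned = drop_publications_with_negative_intervals(records)
```

Dropping the whole publication, rather than only the offending event, reflects the assumption that a questionable DOI created date makes all time intervals of that publication unreliable.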
Several conclusions can be derived from this study. First, we conclude that not all altmetrics are fast and that they do not accumulate at the same speed: there is a fundamental distinction between fast sources (e.g. Reddit, Twitter, News, Facebook, Google+, and Blogs) and slow sources (e.g. Policy documents, Q&A, Peer review, Wikipedia, Video, and F1000Prime). Another important conclusion of this study is that the accumulation velocity of different kinds of altmetric data varies across document types, subject fields, and research topics. The velocity of most altmetric data of Review papers is lower than that of Articles, while Editorial Material and Letter are generally the fastest document types in terms of altmetric reception. From the perspective of scientific fields, the velocity ranking of different data sources changes across subject fields, and most altmetric data sources show higher velocity values in the fields of Physical Sciences and Engineering and Life and Earth Sciences, and lower values in Social Sciences and Humanities and Mathematics and Computer Science. Finally, with regard to individual research topics, substantial differences in the velocity of reception of altmetric events across topics have been identified, even among topics within the same broader field. Such topical differences in velocity suggest that it is worth studying the underlying reasons (e.g. hotness, controversies, scientific debates, media coverage, etc.) why some topics within the same research area receive social (media) attention much faster than others.
See more information about Immediacy Index at: https://clarivate.com/webofsciencegroup/blog/know-your-metrics-immediacy-index/.
See more information about Cited and Citing Half-Lives at: https://clarivate.com/webofsciencegroup/blog/a-closer-look-at-cited-and-citing-half-lives/.
This is the date on which a given altmetric event (e.g. a tweet, a News mention, a Blog citation, etc.) was posted online or published (for policy documents).
Extracted from personal communication with Euan Adie from Altmetric.com.
See more information about CWTS classification system at: https://www.leidenranking.com/information/fields.
Abramo, G., Cicero, T., & D’Angelo, C. A. (2011). Assessing the varying level of impact measurement accuracy as a function of the citation window length. Journal of Informetrics, 5(4), 659–667.
Aksnes, D. W. (2003). Characteristics of highly cited papers. Research Evaluation, 12(3), 159–170.
Alperin, J. P. (2015). Geographic variation in social media metrics: An analysis of Latin American journal articles. Aslib Journal of Information Management, 67(3), 289–304.
Amat, C. (2008). Editorial and publication delay of papers submitted to 14 selected food research journals. Influence of online posting. Scientometrics, 74(3), 379–389.
Aversa, E. (1985). Citation patterns of highly cited papers and their relationship to literature aging: A study of the working literature. Scientometrics, 7(3–6), 383–389.
Björk, B. C., & Solomon, D. (2013). The publishing delay in scholarly peer-reviewed journals. Journal of Informetrics, 7(4), 914–923.
Bornmann, L. (2014). Do altmetrics point to the broader impact of research? An overview of benefits and disadvantages of altmetrics. Journal of Informetrics, 8(4), 895–903.
Colavizza, G., & Franceschet, M. (2016). Clustering citation histories in the physical review. Journal of Informetrics, 10(4), 1037–1051.
Costas, R., van Leeuwen, T. N., & van Raan, A. F. (2010). Is scientific literature subject to a ‘Sell-By-Date’? A general methodology to analyze the ‘durability’ of scientific documents. Journal of the American Society for Information Science and Technology, 61(2), 329–339.
Costas, R., Zahedi, Z., & Wouters, P. (2015a). Do “altmetrics” correlate with citations? Extensive comparison of altmetric indicators with citations from a multidisciplinary perspective. Journal of the Association for Information Science and Technology, 66(10), 2003–2019.
Costas, R., Zahedi, Z., & Wouters, P. (2015b). The thematic orientation of publications mentioned on social media. Aslib Journal of Information Management, 67(3), 260–288.
Darling, E. S., Shiffman, D., Côté, I. M., & Drew, J. A. (2013). The role of Twitter in the life cycle of a scientific publication. PeerJ PrePrints, 1, e16v1.
Didegah, F., & Thelwall, M. (2018). Co-saved, co-tweeted, and co-cited networks. Journal of the Association for Information Science and Technology, 69(8), 959–973.
Fang, Z., & Costas, R. (2018). Studying the posts accumulation patterns of Altmetric.com data sources. In The 2018 altmetrics workshop (Altmetrics18), London, UK. Retrieved from http://altmetrics.org/wp-content/uploads/2018/04/altmetrics18_paper_5_Fang.pdf.
Garfield, E. (1980). Premature discovery or delayed recognition-Why. Current Contents, 21, 5–10.
Glänzel, W. (2004). Towards a model for diachronous and synchronous citation analyses. Scientometrics, 60(3), 511–522.
Glänzel, W., & Schoepflin, U. (1995). A bibliometric study on ageing and reception processes of scientific literature. Journal of Information Science, 21(1), 37–53.
Haustein, S. (2016). Grand challenges in altmetrics: Heterogeneity, data quality and dependencies. Scientometrics, 108(1), 413–423.
Haustein, S. (2019). Scholarly Twitter metrics. In W. Glänzel, H. F. Moed, U. Schmoch, & M. Thelwall (Eds.), Springer handbook of science and technology indicators (pp. 729–760). Heidelberg: Springer. Retrieved from http://arxiv.org/abs/1806.02201.
Haustein, S., Bowman, T. D., & Costas, R. (2015a). When is an article actually published? An analysis of online availability, publication, and indexation dates. In Proceedings of the 15th international conference on scientometrics and informetrics (ISSI), (pp. 1170–1179), Istanbul, Turkey. Retrieved from https://arxiv.org/abs/1505.00796.
Haustein, S., Bowman, T. D., & Costas, R. (2016). Interpreting “altmetrics”: viewing acts on social media through the lens of citation and social theories. In C. R. Sugimoto (Ed.), Theories of informetrics and scholarly communication: A Festschrift in honor of Blaise Cronin (pp. 372–405). Berlin: De Gruyter Mouton. Retrieved from https://arxiv.org/abs/1502.05701.
Haustein, S., Costas, R., & Larivière, V. (2015b). Characterizing social media metrics of scholarly papers: The effect of document properties and collaboration patterns. PLoS ONE, 10(3), e0120495.
Haustein, S., Peters, I., Bar-Ilan, J., Priem, J., Shema, H., & Terliesner, J. (2014). Coverage and adoption of altmetrics sources in the bibliometric community. Scientometrics, 101(2), 1145–1163.
Kurtz, M. J., Eichhorn, G., Accomazzi, A., Grant, C., Demleitner, M., Murray, S. S., et al. (2005). The bibliometric properties of article readership information. Journal of the American Society for Information Science and Technology, 56(2), 111–128.
Lin, J., & Fenner, M. (2013). Altmetrics in evolution: Defining and redefining the ontology of article-level metrics. Information Standards Quarterly, 25(2), 20–26.
Line, M. B. (1993). Changes in the use of literature with time—Obsolescence revisited. Library Trends,41(4), 665–683.
Maflahi, N., & Thelwall, M. (2018). How quickly do publications get read? The evolution of Mendeley reader counts for new articles. Journal of the Association for Information Science and Technology, 69(1), 158–167.
Min, C., Sun, J., Pei, L., & Ding, Y. (2016). Measuring delayed recognition for papers: Uneven weighted summation and total citations. Journal of Informetrics, 10(4), 1153–1165.
Moed, H. F. (2005). Statistical relationships between downloads and citations at the level of individual documents within a single journal. Journal of the American Society for Information Science and Technology, 56(10), 1088–1097.
Mohammadi, E., & Thelwall, M. (2014). Mendeley readership altmetrics for the social sciences and humanities: Research evaluation and knowledge flows. Journal of the Association for Information Science and Technology, 65(8), 1627–1638.
Ortega, J. L. (2018). The life cycle of altmetric impact: A longitudinal study of six metrics from PlumX. Journal of Informetrics, 12(3), 579–589.
Priem, J., Taraborelli, D., Groth, P., & Neylon, C. (2010). Altmetrics: A manifesto. Retrieved from http://altmetrics.org/manifesto/. Accessed 26 Nov 2019.
Robinson-Garcia, N., Arroyo-Machado, W., & Torres-Salinas, D. (2019). Mapping social media attention in microbiology: Identifying main topics and actors. FEMS Microbiology Letters, 366(7), fnz075.
Rowlands, I., & Nicholas, D. (2007). The missing link: Journal usage metrics. Aslib Proceedings, 59(3), 222–228.
Schloegl, C., & Gorraiz, J. (2010). Comparison of citation and usage indicators: The case of oncology journals. Scientometrics, 82(3), 567–580.
Schloegl, C., & Gorraiz, J. (2011). Global usage versus global citation metrics: The case of pharmacology journals. Journal of the American Society for Information Science and Technology, 62(1), 161–170.
Schlögl, C., Gorraiz, J., Gumpenberger, C., Jack, K., & Kraker, P. (2014). Comparison of downloads, citations and readership data for two information systems journals. Scientometrics, 101(2), 1113–1128.
Schubert, A., & Glänzel, W. (1986). Mean response time—A new indicator of journal citation speed with application to physics journals. Czechoslovak Journal of Physics B,36(1), 121–125.
Shuai, X., Pepe, A., & Bollen, J. (2012). How the scientific community reacts to newly submitted preprints: Article downloads, twitter mentions, and citations. PLoS ONE, 7(11), e47523.
Stinson, E. R., & Lancaster, F. W. (1987). Synchronous versus diachronous methods in the measurement of obsolescence by citation studies. Journal of Information Science, 13(2), 65–74.
Sun, J., Min, C., & Li, J. (2016). A vector for measuring obsolescence of scientific articles. Scientometrics, 107(2), 745–757.
Thelwall, M. (2017). Are Mendeley reader counts high enough for research evaluations when articles are published? Aslib Journal of Information Management, 69(2), 174–183.
Thelwall, M., Haustein, S., Larivière, V., & Sugimoto, C. R. (2013). Do altmetrics work? Twitter and ten other social web services. PLoS ONE, 8(5), e64841.
Waltman, L., & Van Eck, N. J. (2012). A new methodology for constructing a publication-level classification system of science. Journal of the American Society for Information Science and Technology, 63(12), 2378–2392.
Wan, J. K., Hua, P. H., Rousseau, R., & Sun, X. K. (2010). The journal download immediacy index (DII): Experiences using a Chinese full-text database. Scientometrics, 82(3), 555–566.
Wang, J. (2013). Citation time window choice for research impact evaluation. Scientometrics, 94(3), 851–872.
Wang, X., Fang, Z., & Guo, X. (2016a). Tracking the digital footprints to scholarly articles from social media. Scientometrics, 109(2), 1365–1376.
Wang, X., Fang, Z., & Sun, X. (2016b). Usage patterns of scholarly articles on Web of Science: A study on Web of Science usage count. Scientometrics, 109(2), 917–926.
Wang, X., Mao, W., Xu, S., & Zhang, C. (2014). Usage history of scientific literature: Nature metrics and metrics of Nature publications. Scientometrics, 98(3), 1923–1933.
Wouters, P., & Costas, R. (2012). Users, narcissism and control-tracking the impact of scholarly publications in the 21st century. Utrecht: SURFfoundation. Retrieved from http://research-acumen.eu/wp-content/uploads/Users-narcissism-and-control.pdf.
Wouters, P., Zahedi, Z., & Costas, R. (2019). Social media metrics for new research evaluation. In W. Glänzel, H. F. Moed, U. Schmoch, & M. Thelwall (Eds.), Springer Handbook of science and technology indicators (pp. 687–713). Heidelberg: Springer. Retrieved from http://arxiv.org/abs/1806.10541.
Yu, H., Xu, S., Xiao, T., Hemminger, B. M., & Yang, S. (2017). Global science discussed in local altmetrics: Weibo and its comparison with Twitter. Journal of Informetrics, 11(2), 466–482.
Zahedi, Z., Costas, R., & Wouters, P. (2014). How well developed are altmetrics? A cross-disciplinary analysis of the presence of ‘alternative metrics’ in scientific publications. Scientometrics, 101(2), 1491–1513.
Zhichao Fang is financially supported by the China Scholarship Council (Grant No. 201706060201). Rodrigo Costas is partially funded by the South African DST-NRF Centre of Excellence in Scientometrics and Science, Technology and Innovation Policy (SciSTIP). The authors thank Prof. Paul Wouters (Leiden University) for valuable suggestions, thank the anonymous reviewer for helpful comments, and thank Altmetric.com for providing the altmetric data of scientific publications.
Fang, Z., Costas, R. Studying the accumulation velocity of altmetric data tracked by Altmetric.com. Scientometrics 123, 1077–1101 (2020). https://doi.org/10.1007/s11192-020-03405-9
- Data accumulation speed
- Velocity Index
- Altmetric half-life
- Time delay