Introduction

More than a decade after the release of the Altmetrics Manifesto (Priem et al., 2010), metrics derived from online mentions of research are now commonplace on the webpages of scholarly articles. However, while these data are collected by aggregators such as Altmetric, Elsevier’s PlumX, and Crossref Event Data (CED), the scholarly community continues to grapple with understanding their limits and potential uses. Although these mentions are often extracted from openly available sources (i.e., web pages or public social media posts), the uncontrolled and ever-changing nature of the web makes collecting and analyzing them especially challenging. Whereas scholarly citation data is extracted from fairly well-structured documents that were created by scholars with the explicit intention of connecting documents using established professional conventions, mentions of research found on the web lack any such common intention or conventions. As a result, they are challenging to analyze and interpret (NISO, 2016). Further complicating matters, the algorithms used to identify mentions are generally not publicly shared, adding further uncertainty to an already messy problem.

Despite these challenges, news coverage-related Altmetrics have the potential to offer valuable insights into “public engagement” with research, an important but complex form of impact to conceptualize and assess (Mahony & Stephansen, 2017). Journalists have long played a gatekeeping role in society, as their decisions on what to cover can determine which information makes its way into public discourse (Shoemaker & Vos, 2009). While social media may be challenging this gatekeeping role to some extent (Bruns, 2018), news coverage remains one of the most common sources citizens use to learn about science (Covens et al., 2018; Funk et al., 2017). In part, this reliance on the news may be linked to journalists’ efforts to make science more accessible. By providing relevant, understandable, and contextualized coverage of complex findings, journalists can play a “knowledge broker” role in society, allowing publics to comprehend and use research in their daily lives (Gesualdo et al., 2020; Yanovitzky & Weber, 2019). News-based metrics may thus offer a meaningful way to examine a broader form of societal impact than is possible through other Altmetrics (Casino, 2018). Yet, despite the potential of news-based Altmetrics, there have been concerns about the quality of this data source (Ortega, 2019a, 2020a), as little is known about what is actually being captured. This lack of empirical evidence about the quality of news mention data is becoming more concerning as the body of scholarship that relies on this data source grows. This study seeks to help fill this knowledge gap by assessing the quality of the news mention data provided by Altmetric.com, using manual content analysis of news stories as a comparison method.

In particular, this study supports individuals who rely on Altmetric to understand journalistic use of research by examining the precision and recall of the company’s news mention data. Using a manual content analysis of 400 science news stories as a comparison method, it addresses the following research questions:

  • RQ1: What proportion of Altmetric research mentions represent actual mentions (i.e., what is Altmetric’s precision)?

  • RQ2: What proportion of actual mentions does Altmetric successfully identify (i.e., what is Altmetric’s recall)?

  • RQ3: What characteristics of the research mention (e.g., presence of a hyperlink, description of research as “a study,” inclusion of a journal title, author name, or publication date) influence its likelihood of being successfully identified?

Literature review

Assessing the quality of Altmetric provider data

A first challenge in determining the quality of Altmetric data is that there is no formally agreed-upon definition of what constitutes a research “mention” and that any such definition is bound to be different for every social media platform or website being examined. However, a mention typically includes research information such as author names, journal titles, and study timeframes, or hyperlinks to research items (e.g., journal articles, datasets, images, white papers, reports). Altmetric data providers also vary in the methods they use to collect and process these mentions, meaning that different providers can yield different event data for the same research outputs (Karmakar et al., 2021). While there is evidence that differences between providers are decreasing over time (Bar-Ilan et al., 2019), concerns about data quality remain (Barata, 2019; Torres-Salinas et al., 2018a; Williams, 2017). Despite these concerns, the use of Altmetric data in research studies remains common practice, likely because of the convenience of the data source and because of persistent interest in using metrics for research assessment, despite their potentially pernicious effects (Fitzpatrick, 2021; Hatch & Curry, 2020).

Among Altmetric data providers, Altmetric.com (or “Altmetric”) is the most popular, used in more than half of Altmetrics studies analyzed in a recent meta-analysis (Ortega, 2020b). Altmetric may be the preferred service because it gathers more mentions of research on Twitter, in blogs, and in news articles, relative to other popular providers (Karmakar et al., 2021; Ortega, 2020b; Zahedi et al., 2014). However, Altmetric’s data are also imperfect. For example, the platform only tracks Mendeley readership for research outputs that have already been mentioned on at least one social network, overlooking a substantial number of readers (Bar-Ilan et al., 2019). Altmetric data also appears to be particularly problematic when it comes to certain types of research items, such as scholarly monographs (Torres-Salinas et al., 2018b) and non-English research (Barata, 2019).

More recently, and more relevant to this study, a series of studies by Ortega (2019a, 2020b, 2021) has raised concerns about the quality of Altmetric’s news mention data in particular. Although Altmetric appears to collect more mentions of research in news stories than other providers, it does so for a smaller proportion of research items (e.g., peer-reviewed papers, preprints, monographs, clinical trials) and sometimes categorizes sources as both blogs and news, resulting in overlaps in event data (Ortega, 2019b). As many as 36% of links to news mentions on Altmetric are broken, a problem that is exacerbated for older mentions that were collected through external parties that have since become defunct (Ortega, 2019a). While Lehmkuhl and Promies (2020) offer some confidence in Altmetric’s ability to identify news stories that mention research (i.e., its recall), their study is limited by a lack of understanding of the recall of the comparison data source used (i.e., the Nexis database).

In addition to these concerns, Altmetric appears to be biased towards news outlets that are published in English, based in English-speaking countries, and focused on general-interest issues, although its news data are still less biased than that of its competitors, PlumX and CED (Ortega, 2020a, 2021). In fact, news mention data gathered from different providers can differ widely; one study found that the Spearman correlation of Altmetric’s news data and that of its competitor, PlumX, was only 0.11 (Meschede & Siebenlist, 2018). Biases in news data, regardless of the provider, are likely mostly a product of the selection of news outlets being tracked, as providers like Altmetric do not automatically gather mentions across the entire web, but rather in “a manually curated collection of sources” (Altmetric, 2020a). They may also be exacerbated by the algorithms used to identify mentions, which could themselves work better in English texts that follow certain journalistic conventions. As others have noted, it is unclear how representative this “completely unsystematic selection of media titles” is of the wider online media landscape (Lehmkuhl & Promies, 2020, p. 15), particularly given that many of the sources Altmetric counts as “news” outlets are news agencies or highly specialized media focusing on scientific literature (Robinson-Garcia et al., 2019). It is also likely that the quality of Altmetric news data changes over time. In 2022, Altmetric reported tracking more than 7000 sources in over 165 countries (C. Williams, personal communication, February 18, 2022), but the list of sources is continually updated and revised. Over time, these updates may help improve the breadth of coverage that Altmetric is able to track and reduce some of the biases Ortega (2020a) identified; yet, it remains unknown when, or whether, they will be eliminated.

Collectively, the studies described above provide important insights into the limitations of Altmetric news mention data in terms of the sources it relies on; however, they do not address other possible limitations, such as those related to the methods Altmetric uses to identify mentions. According to Altmetric, news mentions are collected automatically, through a mixture of link matching—identifying URLs in news stories that link to research outputs—and text mining—crawling the stories for text-based descriptions of research-related metadata (Altmetric, 2020b). While these methods have strengths, they also have weaknesses. In particular, the second approach likely misses actual mentions of research, as any given news story “must include at least the name of an author, the title of a journal, and a publication date” in order for Altmetric’s text mining technology to classify it as a mention (Altmetric, 2020b). The norms, values, and goals of journalists differ from those of scientists (Hansen, 1994); as a result, they may prioritize telling interesting, informative, and entertaining stories over providing detailed bibliometric information about the research they cite. Although journalists often include at least one of the details used by Altmetric’s text mining technology, including all three is uncommon (Matthias et al., 2020). Instead, they often use “general terms” to describe research, referring to “scientific studies” rather than specific research institutions involved in the work (De Dobbelaer et al., 2018). Altmetric’s text mining technology likely overlooks at least a proportion of these text-based news mentions. To our knowledge, no empirical study has documented the frequency of these missed mentions.

Previous work that relies on Altmetric news mention data

Altmetric data is increasingly used both to conduct research and to assess the broader, societal “impact” of research (Ortega, 2020b); flawed data compromises the integrity of both of these activities. Although the use of Altmetric data as a measure of social impact can be problematic, the use of the data as a source for research can yield rich insights into where, how often, and among whom research circulates online. Altmetric data has been used to show that research articles first posted as preprints receive more Altmetric attention than those that were never preprinted (Fraser et al., 2020; Fu & Hughey, 2019). It has also been used to compare online attention to retracted papers, finding that retracted articles were 1.2–7.4 times more likely than matched, unretracted articles to receive Altmetric attention (Serghiou et al., 2021). More recent work relied on Altmetric data to document online attention to COVID-19-related preprints, demonstrating that unreviewed COVID-19 studies received far more attention on Twitter, news sites, and blogs than preprints on other topics (Fraser et al., 2021; Sevryugina & Dicks, 2021).

When it comes to news mention data specifically, providers such as Altmetric allow scholars to examine important questions at the intersection of journalism and science, such as the newsworthiness of different types of research items, the “media impact” of particular countries or publications, and more (Casino, 2018). A small but growing body of research has taken advantage of these opportunities. For example, Maggio et al. (2019) relied on Altmetric data to examine online media coverage of US-government funded cancer research, finding a mismatch between the types of cancers that received the most media coverage and those with the highest incidence rates. Schultz (2021) used Altmetric to understand the relationships between open access (OA) status and the number of news mentions of articles in high-impact journals, finding that OA articles tended to receive more mentions than those published as completely closed access. Other news-focused studies have used Altmetric not only to identify the amount of coverage that research outputs receive but also to generate corpora of stories that can later be analyzed to understand how journalists portray those outputs. Using this approach, Moorhead et al. (2021) found that traditional, legacy news sources included significantly more mentions of research on common cancers than digital native news sources. Matthias et al. (2020) used Altmetric data to identify and analyze mentions of research on opioid-related disorders in US and Canadian media, finding that this research was most often portrayed as “valid science”, with little discussion of study methods or limitations. More recently, scholars have used Altmetric data to understand how COVID-19 preprints are framed in English-language (Fleerackers et al., 2021) and Brazilian news media (Oliveira et al., 2021), finding that journalists inconsistently disclose the unreviewed nature of the research they cite. Studies such as these highlight the potential value of Altmetric news data but also underscore the importance of assessing the quality of the data source.

Methodology

To conduct the study, we gathered all articles published in the science and health sections of the following 8 news media outlets during March–April 2021: The Guardian (Science Section), HealthDay, IFLScience, MedPage Today, News Medical, New York Times (Science Section), Popular Science, and Wired. These news publications were selected for their science and health focus, as well as their representation of the changing media landscape (i.e., The Guardian and the New York Times as traditional, legacy news organizations; Popular Science and Wired as historically print-only science magazines; News Medical and MedPage Today as digital native health sites; and HealthDay and IFLScience as niche science and health blogs). In addition, of all the news sources Altmetric tracks, general-interest and specialized health outlets such as these also appear to cover the most research (Ortega, 2021). Between March and May 2021, we used a custom-built web crawler to read the RSS and Twitter feeds of these sites to identify 5172 articles and associated metadata (i.e., URLs, dates of publication, authors, article titles; Enkhbayar, 2022). The eight outlets appeared to vary widely in how frequently they posted content. Most outlets published around 50 articles per week during our collection period. However, both Popular Science and News Medical stood out with more than 145 stories a week, while Wired published only 11 articles a week. To ensure balanced representation of outlets within the dataset, and to make the manual content analysis more manageable, a random sample of 50 articles from each publication (400 articles total) was drawn for the analysis.
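
The sampling step itself can be reproduced in a few lines of pandas. The sketch below is illustrative only and assumes the crawled metadata were exported to a CSV with hypothetical "outlet" and "url" columns; the actual crawler and data layout are documented by Enkhbayar (2022).

```python
import pandas as pd

# Minimal sketch of the stratified random sampling step, assuming the crawled
# metadata were exported to a CSV with hypothetical "outlet" and "url" columns.
articles = pd.read_csv("crawled_articles.csv")  # 5172 rows in our case

# Draw 50 stories per outlet; a fixed seed keeps the sample reproducible
sample = (
    articles.groupby("outlet", group_keys=False)
    .apply(lambda g: g.sample(n=50, random_state=42))
    .reset_index(drop=True)
)

assert len(sample) == 400  # 8 outlets x 50 stories each
```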

Next, adopting methods utilized in systematic reviews (Page et al., 2021), two independent coders (AF and LN) read and analyzed each of the 400 news stories using a custom codebook (see Supplementary File 1). This codebook was adapted from two codebooks used in the authors’ previous studies of science news coverage, both of which relied on Altmetric news data (Fleerackers et al., 2020; Matthias et al., 2019). Relevant codes were drawn from these codebooks to create a working codebook, which was then refined through an iterative process of coding subsamples of news stories, comparing results, and refining the coding instructions to ensure better alignment between coders. The finalized codebook was then used to recode all 400 news stories in the sample. For each news story, each coder looked for mentions of research in the form of either hyperlinks to academic publications or text-based descriptions of research (e.g., “a study”, “new evidence”, etc.). They then used either the hyperlink provided or a standardized web search method developed by the authors (outlined in Supplementary File 1) to identify the specific study mentioned and saved the corresponding identifier (e.g., DOI, PubMed ID, arXiv ID, ISBN); each research mention was then coded to assess the contextual information provided in the news story (e.g., Was an author name mentioned? Did the story include the title of the journal where the research was published?). The two coders’ coding sheets were then compared and any discrepancies in the coding were resolved through discussion. In instances in which consensus was unclear, LAM and LLM acted as tiebreakers. The resulting dataset was thereafter considered the standard for comparison.

We used the Altmetric Explorer on September 9, 2021, to download the 10,021 research mentions that Altmetric identified in the same 8 outlets during March–April 2021. We then matched stories identified by Altmetric with our standard using the story URL and matched mentions using the research identifiers (e.g., DOI, PubMed ID, arXiv ID, ISBN). We manually reviewed all mentions identified by Altmetric that had no match in the standard data to verify that the news story did not contain a corresponding mention. We corrected instances where Altmetric had an erroneous identifier if it was clearly caused by a typographical error (e.g., DOIs or arXiv IDs that did not resolve due to misplaced brackets or spaces) but did not correct instances where the record pointed to an incorrect research article. Certain outputs, such as books, can be associated with more than one identifier (e.g., multiple ISBNs for multiple versions); we reviewed these cases manually and replaced the identifier provided by the coders with the one provided by Altmetric. The final news mention data can be found online (Fleerackers et al., 2022).
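
The matching logic can be summarized in a short script. The following is a simplified sketch using hypothetical file and column names; the cleaned data and the actual analysis scripts are available online (Fleerackers et al., 2022; Alperin, 2022).

```python
import pandas as pd

# Simplified sketch of the matching step, using hypothetical file and column
# names ("story_url", "identifier") rather than the actual data layout.
altmetric = pd.read_csv("altmetric_mentions.csv")  # mentions from Altmetric Explorer
standard = pd.read_csv("manual_mentions.csv")      # mentions from the content analysis

def clean_identifier(identifier: str) -> str:
    """Normalize identifiers and strip obvious typographical noise
    (stray spaces or brackets that keep a DOI or arXiv ID from resolving)."""
    return identifier.strip().lower().replace(" ", "").replace("(", "").replace(")", "")

for df in (altmetric, standard):
    df["identifier"] = df["identifier"].astype(str).map(clean_identifier)

# A mention counts as matched when both the story URL and the research identifier agree
matched = standard.merge(altmetric, on=["story_url", "identifier"], how="inner")

# Mentions present in only one dataset become candidate false negatives (or,
# from the other direction, candidate false positives) for manual review
review = standard.merge(altmetric, on=["story_url", "identifier"],
                        how="left", indicator=True)
false_negatives = review[review["_merge"] == "left_only"]
```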

Data were analyzed using Python’s pandas package. Binary logistic regressions were calculated using Python’s statsmodels package. All analysis scripts can be found online (Alperin, 2022).

Results

We identified 502 research mentions among 228 of the 400 stories in our sample (the remaining 172 did not mention research). The number of stories with mentions and the number of mentions per story varied by outlet (Table 1); at one extreme, The Guardian only mentioned research in 18 (36%) of the 50 stories in the sample (with an average of 1.7 mentions per story) and at the other, News Medical mentioned research in 43 (86%) of stories (with an average of 1.1 mentions per story). Importantly, it is possible that these outlets mentioned other research items that were not identifiable using either of our methods (i.e., Altmetric or content analysis). Specifically, the coders noted that many stories included phrases such as “studies have found” or “there’s a wealth of research,” which suggest that a body of evidence has been referenced but do not provide enough detail to identify specific research items. While we did not track such instances systematically, the results presented in Table 1 likely underestimate the true amount of research that is referenced in the news stories.

Table 1 Number of stories and mentions across news outlets

Outlets also varied in how they referred to the research outputs they mentioned (Table 2). For example, while outlets such as Wired, News Medical, and MedPage Today provided hyperlinks to almost all the research they mentioned, others did so less consistently, and one outlet (HealthDay) never linked to research. In contrast, HealthDay was among the most consistent when it came to describing bibliographic details—providing an author name and institution for almost every research mention, often alongside a journal and publication date—while other outlets did so less frequently.

Table 2 How research was mentioned across news outlets

Despite these differences, there were several observable trends in how outlets referred to the research they mentioned. The two most common ways to mention research items were to hyperlink to them and/or describe them with terms such as “research” or “studies.” Providing an author and an institution were the next most common strategies, followed by mentioning the journal or publication venue, and finally, indicating the publication date. Strategies commonly used together can be seen in Supplementary File 2 Fig. 1.

Precision and recall of Altmetric’s research mention data

Altmetric identified at least one mention in 163 (71%) of the 228 news stories with mentions. Within these stories, it correctly identified 349 (70%) of the 502 mentions in the standard dataset, while also identifying 21 incorrect mentions. It failed to identify the remaining 153 mentions. Standard information retrieval measures (e.g., precision, recall, and F-score) of Altmetric’s research mention data are summarized in Table 3; common errors contributing to these scores are described in detail below.

Table 3 Precision and recall of Altmetric research mention data
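
For reference, these measures follow directly from the counts reported above. The worked example below assumes the standard binary definitions of precision, recall, and F-score applied to the aggregate counts (349 true positives, 21 false positives, 153 false negatives); Table 3 reports the actual values.

```python
# Worked example using the aggregate counts reported in the text
tp, fp, fn = 349, 21, 153

precision = tp / (tp + fp)   # ~0.94: most mentions Altmetric identifies are correct
recall = tp / (tp + fn)      # ~0.70: a sizeable share of actual mentions are missed
f_score = 2 * precision * recall / (precision + recall)  # ~0.80

print(f"precision={precision:.2f}, recall={recall:.2f}, F={f_score:.2f}")
```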

False positives

In some instances (n = 21), Altmetric identified mentions of research that the two coders did not. A closer examination of these errors revealed that in most of these cases (n = 15) Altmetric had identified mentions of items that were not research, but rather journalistic stories in sources like The Conversation or Nature News. In three of the remaining cases, Altmetric had identified an erroneous research item (i.e., research that was not actually mentioned in the story) that had been published in the same journal, by one or more of the same authors, and around the same time as the true research mention (most likely due to Altmetric’s reliance on author names and journals for text mining). The final three cases were duplicates, in which Altmetric had identified the same mention twice for the same research item. This particular error is known to occur when news stories are revised or updated; even if only minor changes have taken place, Altmetric’s system treats the revised story as an entirely separate one, resulting in duplicate mentions (C. Williams, personal communication, February 18, 2022).

False negatives

False negatives, or instances in which Altmetric failed to identify research that both coders agreed was mentioned, were far more frequent than false positives (n = 153). While it is not possible to explain why these mentions were missed, we observed common trends in the types of mentions that most often resulted in false negatives. For example, Altmetric often failed to identify mentions of research in news stories that described the results of the research but provided little or no bibliographic information (i.e., no author name, journal, publication date) and did not include a hyperlink. Such mentions often provided detailed statistical results, methods, and sample information that also appeared in the research article’s abstract; referred to a famous study by name (e.g., the Stanford Prison Experiment); or described highly unusual, and thus easily searchable, research findings (e.g., two species of sea slug were discovered to be capable of regrowing new bodies from their severed heads). In addition, conference presentations, books, dissertations, and non-English research items were often missed, particularly those that were not associated with a DOI. Altmetric also failed to identify research when the news story included a link to a press release describing the research, but no link to the research itself. In a few cases, it was unclear why Altmetric had missed the mention, as all of the information needed to identify the research had been provided (i.e., a hyperlink or text-based description of the journal, author name, and publication date).

Accuracy across news outlets

Perhaps because news outlets differed in their use of research (as discussed above), the accuracy of Altmetric’s data varied across outlets (Table 4). In particular, Altmetric was most successful in identifying mentions of research in stories from outlets such as News Medical and Wired, and less accurate when it came to outlets such as Popular Science or The Guardian. We investigate some of the factors that could be driving these differences in the following section.

Table 4 Precision, recall, and accuracy (F-score) by news outlet

Characteristics of research mentions that influence accuracy

To answer RQ3, we calculated a binary logistic regression that examined whether the probability of Altmetric identifying a research item depended on how the research was mentioned in the story. More formally, we fit a model of the form logit(P(Y = 1)) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5 + β6x6, where Y is a binary outcome variable coded as 1 if Altmetric identified the correct mention and 0 otherwise, and x1…x6 are a set of predictor variables corresponding to the six characteristics examined, each coded as 1 if we identified the characteristic and 0 otherwise. We found that Altmetric was significantly more likely to identify the research mention when it included a link (odds ratio = 53.8, p < 0.001), when the journal name was mentioned (odds ratio = 3.9, p = 0.003), and when the author was mentioned (odds ratio = 4.2, p = 0.010). The coefficients for the other three predictor variables were not significant (see Table 5). That is, Altmetric was no more or less likely to successfully identify a mention if the research in question was described as a study, included the author’s institutional affiliation, or provided a publication date.

Table 5 Results of logistic regression
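
A model of this form takes only a few lines in statsmodels. The sketch below is illustrative and uses hypothetical column names for the coded dataset (a 0/1 "identified" outcome plus six 0/1 predictors); the actual analysis scripts are available online (Alperin, 2022).

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Minimal sketch of the logistic regression, assuming a coded dataset with a
# 0/1 "identified" outcome and six 0/1 predictor columns (hypothetical names).
mentions = pd.read_csv("coded_mentions.csv")

predictors = ["has_link", "described_as_study", "journal_named",
              "author_named", "institution_named", "date_given"]

X = sm.add_constant(mentions[predictors].astype(float))
y = mentions["identified"].astype(float)

result = sm.Logit(y, X).fit()
print(result.summary())

# Exponentiating the coefficients gives odds ratios comparable to Table 5
print(np.exp(result.params).round(1))
```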

Discussion

This study assessed the accuracy of Altmetric’s news mention category, a data source increasingly used by scholars in a wide range of fields but about which relatively little is known. Using a manual content analysis of 400 news stories as a comparison method, we analyzed the precision and recall with which Altmetric identified mentions of research for a set of 8 news outlets, as well as which characteristics of how the research was mentioned influenced its likelihood of being successfully identified.

Our findings suggest that, when working from a predefined list of news outlets, and with some caveats and limitations in mind, Altmetric can be a useful and relatively reliable source of news mention data. Using the service in this way allows scholars to easily identify mentions of research in those outlets with very high precision and with variable, but in many cases acceptable, recall. That is, given a set of outlets similar to the ones we studied, Altmetric appears to use a conservative approach that ensures accuracy in its data at the expense of completeness. As a result, research mentions found by Altmetric for any given news outlet can be considered a reliable lower bound of all research mentions for that outlet.

This finding is important because Altmetric offers scholars the potential to save considerable time and resources at the data collection stage of their research. The comparison method used for this study (manual content analysis) proved to be complex and resource-intensive. It took two researchers around 44 h each to identify the 502 mentions in the final data; of these, 36 h were needed to identify and code the mentions and the remaining 8 h were spent discussing discrepancies to arrive at a consensus. In comparison, identifying mentions via the Altmetric Explorer took about 5 min. In addition, our approach required the selection and curation of relevant news sources and a custom set of scripts to collect their published articles. While we relied on publicly available dissemination channels (e.g., APIs, RSS feeds, or Twitter feeds) to crawl relevant content, Altmetric might have more reliable access to such data through paid third-party services or direct agreements with publishers.

That said, our analysis offers assurances about the accuracy of Altmetric news mention data only under a limited set of conditions and should only be relied upon with an understanding of some important limitations. In particular, our analysis does not offer assurances about Altmetric news mention data as a whole. Scholars seeking to use Altmetric in their research should thus consider our results alongside other known limitations of this data source, including its linguistic, geographic, and disciplinary biases (Ortega, 2020a) and its high incidence of broken links (Ortega, 2019a). We note that there may also be other limitations of Altmetric news mention data that are not yet known.

In addition to assessing precision and recall, this study provides insight into the characteristics of how research is mentioned that influence its likelihood of being successfully identified by Altmetric. In particular, this data source seems to be most reliable for mentions of research that are accompanied by a hyperlink, followed by mentions of research that include an author name or journal title. In some ways, this finding is unsurprising; Altmetric’s website explicitly states that, in cases where no hyperlink is present, a research mention must include an author name, journal title, and study date to be identified (Altmetric, 2020b). However, it is interesting that the presence of a study date made no significant difference in the likelihood that Altmetric would correctly identify a given research output. It is unclear why this is the case, but it could be related to how news stories tend to describe publication dates. In particular, many of the stories we analyzed did not include an explicit year of publication; instead, they referenced publication dates using statements such as “released last week,” “published Monday,” or, even more frequently, simply “new.” Future research is needed to better understand how or whether these more conversational references to study dates influence Altmetric’s ability to identify the research. More broadly, our results suggest that Altmetric struggles to identify mentions of research that offer few or no bibliographic details (for example, mentions that describe research studies using only their key findings). Our coders identified this as a relatively uncommon practice; only 30 (6%) of the 502 research mentions in our data set did not include any bibliographic information, but its prevalence could differ in other sets of stories.

In addition, Altmetric news mention data for scholarly books appears to be particularly problematic. Books are a challenge in part because multiple ISBNs may be associated with the same book, as noted by other scholars (Torres-Salinas et al., 2018b). However, our study suggests an additional challenge: distinguishing between mentions of academic and trade books. In our data, Altmetric identified an inflated number of research mentions from books that both coders agreed were not scholarly (e.g., self-help or health books that were written by authors with academic affiliations but were published by trade publishers rather than scholarly presses). This is an important limitation that will likely be exacerbated when exploring questions related to humanities research, which is less frequently mentioned in Altmetric sources (Thelwall, 2018) and for which monographs are a more common form of scholarship (Hicks, 2005). As such, extra caution should be taken by those who wish to use Altmetric data to measure academic success or “impact”.

Finally, we note that in addition to tracking news mentions of research items, such as journal articles or preprints, Altmetric sometimes also tracks news mentions of blog posts or news stories about research items—so-called “second-order citations” (Priem & Costello, 2010). While understanding the circulation of science news or blog posts may be useful in certain contexts (e.g., for examining how science journalists cite each other’s work), it may add confusion in studies seeking to understand media attention to scholarly research. Researchers using Altmetric for this purpose should be aware that the data may contain both first- and second-order citations of research and, if appropriate, filter out those outputs marked as “news” before performing any analysis. They should also be aware that Altmetric only tracks certain types of news stories and blog posts mentioning research (e.g., those published by The Conversation, Science, IEEE Spectrum News) and not other types of second-order citations, such as press releases about research.

Limitations

These findings must be understood alongside several limitations. First, we report on the quality of Altmetric news data for a predefined set of 8 news outlets, while Altmetric collects research mentions for more than 7000. Although these 8 sources were selected because they cover a range of outlet types that are likely to mention research, they are not representative of all outlets that might mention research, nor all outlets tracked by Altmetric. For example, it is likely that Altmetric’s precision and recall for identifying mentions of research in non-English outlets differ from those found in this study. Additionally, as noted by Lehmkuhl and Promies (2020), Altmetric data includes sources that could arguably not be considered “news,” including content aggregators, such as Foreign Affairs New Zealand; press alert services, such as EurekAlert; websites that allow anyone with an account to produce content, such as Medium; and outlets known to promote misinformation, such as Zero Hedge (Cranley, 2020). The data can also include duplicate links or discrete links that highlight duplicate mentions of research (e.g., research briefs or press releases with minor changes in text or publishing timestamps). Future research should extend these findings using a larger or distinct subset of news outlets, including content aggregators, particularly given that precision and recall do appear to differ somewhat across outlets.

Finally, this study examined how characteristics of the news story (e.g., outlet) and news mention (e.g., presence of a hyperlink, author name) influence Altmetric’s ability to accurately identify mentions of research, but we did not examine any characteristics of the research items themselves. Although we provide some qualitative evidence that certain kinds of outputs (e.g., books, conference papers) appear to generate more problems for Altmetric than others, future research should more rigorously examine the factors that influence a given output’s likelihood of being successfully identified (e.g., type of research output, journal it is published in, how old it is, language, etc.).