Introduction

The rise of the Altmetrics allowed measuring broader impact of scholarly publications (Adie, 2016; Holmberg, 2015; Sugimoto et al., 2017; Warren et al., 2017), expanding the cited publications to be analyzed (from journal articles to any scholarly output), the citing publications to be considered (from scholarly publications to non-scholarly publications), and the nature of the performance metrics available (from bibliographic citations to usage, dissemination, comments, discussion, rating or connectivity), leading to a new generation of research metrics (Orduña-Malea et al., 2016; Priem & Hemminger, 2010) led by social acts (Haustein et al., 2016a).

By embedding scholarly works or the online research objects representing them (e.g. the full-text article’s URL) on social applications and platforms (Haustein, Bowman and Costas, 2016a), meta-researchers were allowed to capture signals of the interest, dissemination and engagement of the research endeavor beyond the scholarly environment (Tahamtan & Bornmann, 2020). This way, the development of the Altmetrics field unraveled data prevalence differences between social media sources (Haustein et al., 2015; Thelwall, 2018), data aggregators (Zahedi, et al., 2015; Ortega, 2018a, 2018b, 2020), and data accumulation velocity differences (Fang & Costas, 2020), being those differences shaped by the characteristics of each discipline (Htoo & Na, 2017; Orduna-Malea & Delgado López-Cózar, 2019; Zahedi et al., 2014).

Patents have been scarcely studied within this social analytical framework. The role of patents in Altmetrics studies has been limited to being a source of citations (i.e. the citing publications) for the documents being monitored (i.e. the cited publications), in a similar way than the role played in Patentometrics classic studies, were references from patents to papers are analyzed (Hammarfelt, 2021). Earlier webometric studies on patents have also focused on URL mentions from patents to other online resources (Orduna‐Malea, et al., 2017; Font-Julián et al., 2022).

Altmetrics reflects an evolution from Scientometrics 1.0 (Fig. 1; mode A) to Scientometrics 2.0 or social Scientometrics (Fig. 1; mode B). However, Patentometrics 1.0 (Fig. 1; mode C) did not evolve into a social Patentometrics model (Fig. 1; mode D). This contribution intends precisely to lay the foundations for social Patentometrics studies.

Fig. 1
figure 1

Basic models of Scientometrics and Patentometrics studies. Note: arrows connect citing documents to cited documents

The lack of social media metrics studies of patents might be due to the following considerations:

Purpose of patents

“A patent is a right granted to the owner of an invention that prevents others from making, using, importing, or selling the invention without the inventor’s permission” (Marley, 2014), limited to one specific jurisdiction during a limited period. Therefore, patents are legal documents, whose generation, consumption and impact are guided by dynamics different from those of the scholarly community.

Access to patents

While patents are made available to the public at large as part of the disclosure obligation of inventors (Graham and Hedge, 2015), most web-based patent databases exhibit lack of advance search features (Marley, 2014). Likewise, other databases are offered by the patent office where the patents were submitted to request protection, limiting its coverage to that jurisdiction, and preventing thus the analysis of family patents (Martínez, 2011). In other cases, web-based patent databases offer only online descriptive metadata, with no online access to the full-text document.

Interest on patents

As part of the disclosure obligation, inventors are required to accomplish a few conditions, which can vary slightly from one patent office to another. For example, patentees in the United States must satisfy three conditions (Ouellette, 2012): a written description (disclosing the technologic knowledge upon which the patent is based, and demonstrating that the patentee is in possession of the invention that is claimed), enablement (how to make and how to use the invention), and best mode (the patent must include “the best mode contemplated by the inventor or joint inventor of carrying out the invention”). This makes patents useful documents with technical solutions to problems, offering researchers both scientific and legal benefits for reading patents (Ouellette, 2017). However, the literature has also evidenced potential concerns related to the information offered. Patents might be perceived as hard to read and understand, vague, with extensive legal jargon included. In addition, there is a perception that reading patents might lead to increased liability for ‘willful’ patent infringement (Ouellette, 2012). These issues might show patents as less attractive documents for researchers (Lemley, 2008), making them underutilized scientific resources.

The subsequent emergence of open full-text patent databases accessible on the Web (each patent holds a URL) and covering documents from many patent offices around the world, enabled webometrics and Altmetrics studies to be carried out, allowing the social Patentometrics model (Fig. 1; mode D). These databases include patent search facilities (e.g. The Lens), patent search engines (e.g. Google Patents, Yandex Patents) and patent databases (e.g. Free Patents Online, Trea).

URLs linking to openly available full-text patents can be subsequently embedded on social media platforms (e.g. Twitter, LinkedIn), not only enhancing the dissemination of the patents but also allowing the capture of usage, interest and readership evidence, an aspect scarcely covered by the literature–basically through survey data (Ouellette, 2017), with no quantitative studies to date.

Among the currently available social media platforms, Twitter stands out due to its communicative nature and wide use. As of May, 2022, Twitter exhibits a high number of users (around 229 million monetizable daily active users) and tweets published (around 850 million tweets per day). Moreover, the existence of an Academic API and the availability of a wide variety of engagement metrics make Twitter a suitable platform to collect interest on patent publications.

Following in the model proposed by Fang et al. (2021), the Fig. 2 shows how Twitter can be employed to connect two landscapes: the innovation landscape and the Twitter landscape. Each online full-text patent holds a URL (research object) which can be embedded on one tweet. This tweet can be engaged by other users, who can like, retweet or reply to the original tweet. In addition, by clicking on the URL, users can be redirected to the patent. This model can collect three types of metrics: URL-based metrics (e.g. number of times a patent URL has been mentioned), tweet-based metrics (e.g. number of likes that a tweet mentioning patents has received), and patent-based metrics (e.g. number of visits that a patent has received from Twitter).

Fig. 2
figure 2

Role of Twitter in social patentometrics studies

The main objective of this study is to disclose how patents are disseminated on Twitter. Specifically, this work focuses on analyzing three elements: tweets linking to patents (e.g. what volume of tweets is generated, what type of tweets are published, what impact do they generate); users linking to patents (e.g. what type of users link to patents, what is their activity on Twitter); and patents linked from Twitter (e.g. which patents are linked more frequently, which Patent Offices obtain a greater diffusion, which are the main subjects covered by the tweeted patents). This exploratory and descriptive research intends to lay the foundations for the future design of engagement metrics aimed at understanding the interest, dissemination, and engagement of patent documents in social environments.

To address that main objective, the following research questions are set:

  • RQ1. What is the volume of tweets linking to patents?

  • RQ2. What is the impact of tweets linking to patents?

  • RQ3. What type of Twitter users link patents?

  • RQ4. Which patents are most linked from Twitter?

To answer the research questions established above, the Google Patents database is used as case study. This decision is based on its wide global coverage of patents, its ease of use, and the generation of friendly URLs for each patent, which have made Google Patents one of the main patent discovery tools for researchers (Ouellete, 2015).

Research background

Google patents: a global online full-text patent discovery tool

Google Patents is a search engine and discovering tool launched in December 14, 2006, that indexes full-text granted patents and patents applications. As of May 2022, Google Patents covers over 140 million patent publications from 105 patent offices around the globe. Full-text documents are indexed from 22 patent offices.

Google Patents includes advanced search engines, global litigation information and a Prior Art Finder Tool, which includes a copy of the “technical documents and books indexed in Google Scholar and Google Books, as well as documents included in the Prior Art Archive”.Footnote 1 For each patent, full-text, figures, the original PDF version, metadata, and citations are included. Patents with only non-English text have been machine-translated to English.Footnote 2

While Google Patents is currently used as data source for Patentometrics studies and literature reviews (e.g., Narayanankutty, 2019, 2022), the literature aimed at describing and characterizing its features from an informational perspective has been limited. Noruzi and Abdekhoda (2014) and Marley (2014) described its search functionalities, and Moskovkin et al. (2012) showed Google Patents as a patent-metric tool useful to analyze the patent activity of world-wide leading innovation companies. Other works have explored Google Patents as a tool to find patent citations to scholarly works (Kousha & Thelwall, 2015) or URL citations to university websites (Orduna‐Malea, Thelwall and Kousha, 2017).

Twitter: a social platform disseminating contents

Twitter is a real-time microblogging and social networking platform, launched in July 2006. Users can post brief plain text messages referred to as tweets, which can be liked or retweeted (with or without a quote) by other users. The platform collects metrics both at the user-level (e.g. number of followers, followings, tweets posted) and at the tweet-level (e.g. number of likes or retweets received) along with a wide variety of descriptive contextual metrics (e.g. tweet language, user location, etc.), making Twitter a potential tool to obtain scholarly metrics (Haustein, 2019).

Since the origin of Altmetrics, Twitter mentions have become one of the most important Altmetric events for scientific publications (Sugimoto et al., 2017), being the number of likes an important factor for measuring the social media activity of users around science (Díaz-Faes et al., 2019), although limitations such as the stability of twitter data (Fang et al., 2020a, 2020b) should be taken into account. Interactions between users and tweets around scholarly publications have been also analyzed (Friedrich et al., 2015; Haustein et al., 2015; Didegah et al., 2018; Hassan et al., 2021; Costas et al., 2021; Fang et al., 2021), including automatically generated content (Haustein et al., 2016b).

Despite the extensive literature focused on the scholarly use of Twitter, however, the dissemination and interest of patents in this social networking platform has not been studied to date.

Method

Tweets data collection

A python script was employed via the Academic Twitter API v2Footnote 3 (full-archive search endpoint) to collect all tweets containing a URL to a full-text patent available on Google Patents, using the following ULR seed: ‘patents.google.com/patent/*’. The data collection expanded from 26 March 2006 (the birth of Twitter) to 31 December 2021. Data extraction was carried out by 23 February 2022. Original, reply and quoted tweets were considered, while retweets were excluded as these tweets are re-publications that can distort the results obtained.

For each tweet, the following parameters were captured: author id, username, tweet id, conversation id, public metrics (number of retweets, replies, likes, and quotes received), Twitter type (original, reply, quoted), and the tweet text. Results were obtained in JSON format, which were subsequently transformed into CSV files via OpenRefine to be further analyzed. This process yielded 126,815 tweets from 26,106 users.

Accounts data collection

A second python script was built (Users by ID endpoint) to collect users’ descriptive data. For each Twitter user, the following data were captured: username, created at, lists, tweets, followers, and followings. Data was captured by 17 May 2022.

To characterize users’ behavior, the Botometer APIFootnote 4 was employed. Botometer is an application that scores Twitter users from 0 to 5 considering six variables,Footnote 5 and using language-independence features (Sayyadiharikandeh, 2020). Scores near 5 reflect a bot-like behavior, while scores near 0 reflect human-like behavior. To do this, the most productive Twitter users (those who published at least 10 tweets linking to patents) were analyzed. While this threshold might be perceived as a subjective decision, these users (536) are responsible of the 70.1% of all tweets collected, constituting a representative sample. Data was collected by 24 May 2022, using the display universal score.

Patents data collection

All URLs embedded in each of the 126,815 linking tweets were extracted, comprising 146,641 links, both to patents and other online resources. As Google Patents uses URL aliases,Footnote 6 a data cleansing process was needed to identify the patents tweeted, by extracting the patent ID embedded in each Google Patent URL, which includes the country/Area code (patent office), number constitution, and kind code. All the type codes related to the same patent number constitution were combined to facilitate the analysis. This process yielded 86,417 patents.

The Patent API v.1.2.7 offered by The LensFootnote 7 was used to collect descriptive data related to each patent. The following parameters were considered: publication date, patent type, patent language, patent status, and patent citations. Data were collected by 27 May 2022.

Results

RQ1. What is the volume of tweets linking to patents?

The first tweet containing a link to a Google Patents’ full-text patent appeared in 2015, date that coincides with a significant update of Google Patents, including its integration with Google Scholar.Footnote 8 Since then, 126,815 tweets have been published (Fig. 3), mainly original tweets (75.6%) and replies (23%). The year 2018 marks a milestone (23,888 tweets published), date that also coincides with another Google Patent’s update, in this case with the addition of global litigation information, via a partnership with Darts-IP,Footnote 9 currently part of Clarivate.

Fig. 3
figure 3

Number of linking tweets by tweet type over the years

RQ2. What is the impact of tweets linking to patents?

The engagement received by tweets differs according to the type of tweet. Quotes and replies receive in average (arithmetic mean) more likes and retweets than original tweets (Table 1), which might imply that tweets around conversation generate more interest on users. Due to the skewed data, geometric means are also offered. In this case, we observe that original tweets improve their engagement notably. However, as the geometric mean operates only with values greater than 0 (i.e. eliminating all tweets with no engagement), the results can be misleading and should be taken cautiously and jointly with the arithmetic means obtained.

Table 1 Average engagement metrics according to the tweet type (original, reply, quote)

The prevalence of the engagement metrics is limited (Table 2). Only 19.8% of all tweets linking to full-text patents have attracted at least one like. Likewise, 12.8% of tweets have received at least one reply, 9.1% of tweets have attracted at least one retweet, and 3.4% of tweets have received at least one quote. These percentages, however, have annually increased since 2017.

Table 2 Annual engagement metrics: likes, retweets, replies and quotes

All the engagement metrics measured show skewed distributions. The median values for all metrics and all years are zero. This way, few tweets achieve outstanding values (e.g. only 265 tweets attract more than 100 likes; only 105 tweets attract more than 100 retweets; and only 2 tweets attract more than 100 replies or quotes), while most tweets attract low interest: 12,819 tweets (50.9% of all tweets receiving likes) attract only one like.

The correlation (Spearman) between the engagement metrics is statistically significant, but moderate. Only the number of likes and retweets achieve a moderate positive correlation (Rs = 0.54; p-value: < 0.0001; α > 0.05) (Table 3a). When data is restricted to tweets with at least 10 likes received (Table 3b), results improve slightly, especially the correlation of quotes with the remaining metrics.

Table 3 Correlation (Spearman) between engagement metrics

A plausible reason for the low correlation values obtained is the low prevalence of the engagement metrics, previously observed. The high number of tweets with 0 or 1 likes/retweets/replies/quotes distorts the correlations achieved. However, when the number of likes received achieves a threshold (around 10), the correlation between the numbers of likes and retweets increases (Fig. 4). This threshold effect is hardly noticeable for the number of quotes and replies, which seem to reflect a different engagement dimension.

Fig. 4
figure 4

Scatter plot for engagement metrics: links count, retweets count, replies count, quotes count

RQ3. What type of twitter users link patents?

The 126,815 linking tweets have been published by 26,106 unique users. The number of unique users per year has increased notable since 2020. In 2021, a total of 10,006 unique users published 36,175 tweets linking to a full-text patent available on Google Patents (Fig. 5).

Fig. 5
figure 5

Number of Twitter users over the years

The distribution of linking tweets published per user is also quite skewed: 2 authors have published at least 100 tweets (high performers), while 19,656 users (75.3% of all users) have published only one linking tweet (sporadic users); 15.6% of all tweets come from sporadic users.

Two Twitter accounts (DailyPatent and uspatentbot) jointly publish 53.3% of all tweets (33,864 and 33,775 tweets, respectively), constituting the most influential users. These accounts are highly productive, do not follow other users, and most of their tweets published link to Google Patents (68.6% and 92.9%, respectively).

Considering the most productive users (Table 4), no specific characteristics regarding their behavior can be distinguished. We can find productive users with high number of followers and followings (e.g. tatzanx), productive users with high number of tweets published but few followers (e.g. DinahParums), productive users with low number of both followers and followings (e.g. covidventilator), or unproductive users with high number of followings (e.g. PPAtrading). Patents_bot and patentsexpiring are outlier accounts: all their published tweets link patents. In other cases, the linking patents amount for a low percentage of the total tweets published (e.g. COILPOD).

Table 4 Users publishing most linking tweets to Google Patents

A deep analysis of 536 moderate and highly performers (those publishing at least 10 tweets linking to Google Patents) reveals an obsolescence of users (32 accounts have been suspended or eliminated at the time of the data analysis), a remarkable presence of bots (or human publishing like a bot), with 10% of those 504 active accounts exhibiting a Botometer display universal score higher than 4, of which 32 are self-declared bots (Fig. 6), and a high presence of individual (445) over organizational (59) accounts.

Fig. 6
figure 6

Types of users publishing most linking tweets to Google Patents (N = 512). Note: each user can be categorized under more than one type

RQ4. Which patents are most linked from twitter?

The United States Patent and Trademark Office (UPSTO) is the jurisdiction receiving most links from Twitter (112,949 links to 81,156 patents), followed at a great distance by the World Intellectual Property Organization (WIPO), the China National Intellectual Property Administration (CNIPA), European Patent Office (EPO) and the Japan Patent Office (JPO). The remaining jurisdictions are linked in a minority way. Otherwise, it is noteworthy the low number of links to patents from Denmark (8), Austria (5), Finland (3), Brazil (2) and Belgium (0) jurisdictions, whose patents are indexed full-text on Google Patents.

Table 5 includes the number of patents tweeted and the number of links to unique patents tweeted per patent office. Even considering that each Patent measured includes all the patent applications and granted patents related the same patent ID, the number of patents linked from Twitter constitutes a small percentage of the patents covered by Google Patents for each jurisdiction.

Table 5 Number of patents tweeted by patent office

Human necessities (383 patents) and Physics (156 patents) are the subjects most covered by the patents most tweeted (those patents tweeted at least 5 times; N = 818). The presence of the remaining subjects is low (Table 6).

Table 6 Number of patents tweeted by Cooperative Patent Classification (CPC)

A Chinese written patent application (CN112220919), related to covid-19 (with CPC code “A” [human necessities] assigned), published in 2021 and still pending, is the full-text patent most tweeted. This fact evidences that neither the time of publication, nor patent citations count or status are decisive variables in obtaining a higher dissemination on Twitter. Table 7 delve into this issue by displaying the top 20 tweeted patents, including the status, the publication date, the first CPC code, the number of patent citations (both from Lens and Google Patents), and the total number of citations (from Google Scholar). As we observe, highly tweeted patents include old/recent, active/expired or highly/lowly cited patents. Indeed, few highly cited patents have been highly tweeted (Fig. 7).

Table 7 Patents most tweeted
Fig. 7
figure 7

Scatter plot between the number of patent citations received and the number of tweets linking to the patent (N = 839)

Discussion

This study represents the first attempt of studying patents documents linked from Twitter. The linking tweets (volume, type, and impact), linking users (productivity, type, profile) and linked patents (jurisdiction, subject, and impact) have been determined and characterized. The main findings are discussed below.

Volume of tweets linking to patents

The number of linking tweets is increasing since 2015, constituting a large-scale dataset for measuring purposes. The years 2020 and 2021 are remarkable with more than 30,000 linking tweets published in each of these years.

The increase of the overall volume of tweets over the years (in August 2013 were published around 500 million tweets per day; this figure has risen to around 867 million tweets in August 2022)Footnote 10 and the launch of the bots Dailypatent and uspatentbot (created at 2016 and 2017, respectively) seem to be among the main Twitter-related causes of the take-off of tweets embedding links to Google Patents.

Beyond the Twitter activity, the integration of Google Scholar with Google Patents in 2015 might have also influenced the rise of linking tweets. Given the importance of Google Scholar as a discovery tool (Delgado López-Cózar et al., 2019), the use of this academic search engine could facilitate the discovery of patents, which users later spread through Twitter. These results are also aligned with the findings obtained by Ouellette (2017), who surveyed 832 US academic and industry researchers finding that 43% of these respondents found patents through Google Scholar, being this academic search engine the third preferred method to find patents after Google Patents (50%) and the USPTO website (45%). Therefore, we can infer that there was a latent interest in disseminating patents (the existence of over 127,000 tweets linking full-text patents available on Google Patents in 7 years might evidence that interest), and that the Google Scholar/Google Patents combination played a key role in the process of social dissemination of patents on Twitter.

A slight slowdown in the generation of linking tweets is detected in 2021. Therefore, it should be checked if after the covid-19 pandemic the number of linking tweets could be reduced, which seems likely given the remarkable number of coronavirus-related patents among the most tweeted patents (Table 7).

Impact of tweets linking to patents

73.3% of all linking tweets (92,968) have obtained no engagement at all (zero likes, retweets, replies and quotes), while only the 1.6% (1,637) have obtained at least one interaction in each of the engagement metrics measured. This result is aligned with the user engagement found for 7,037,233 unique original scholarly tweets (Fang et al., 2022), where only the 2% of the tweets attained engagement in all the four types of user engagement.

The data prevalence for each of the engagement metrics achieves low percentages (see Table 2). Therefore, the results evidence a low impact, especially for original-type tweets. Reply-type tweets attain higher average engagements. For this reason, this type of tweets might be of great interest to locate and study conversations between Twitter users in which links to full-text patents are included as information resources.

The prevalence percentage of all the engagement metrics is increasing over the years. Hence, if the upward trend were to continue, the relevance of tweets to learn about the social dissemination of patents would increase. However, the skewed distribution found for all the engagement metrics (i.e. few tweets attract most of the engagement) could imply a high dependence on the overall impact of a few tweets linking few patents, which could have been mentioned for any reason. For example, the tweet 1313565051048128513, published in 2020, is the tweet most liked (7803 likes), retweeted (3033 retweets) and quoted (196 quotes) in the dataset, appearing in 127 linking tweets. This tweet embeds a URL to the patent US4656917A, a historic patent whose inventor is Van Halen, a famous guitarist who passed away that year. Likewise, the patent most tweeted in the dataset (CN112220919; embedded on 2490 tweets) describes the invention of a new coronavirus vaccine that contains graphene oxide, having generated extensive discussion and controversial both on and off Twitter, inside and outside the academic community.

The moderate correlations between the engagement metrics are partly caused by the effects of skewed distributions, that is, the high percentage of zero results, which should be avoided in overall descriptive analyses. Establishing a threshold (tweets with at least 10 likes received), the data shows a moderate significant positive correlation between the number of likes and retweets received (both reflecting a passive engagement; new content is not created). While the number of quotes received can be analyzed as an active engagement (i.e. new content is created), this metric is closer to the number of retweets received, which makes sense as the quote is a type of retweet (i.e. the users decide to quote when they are retweeting). Finally, the number of replies received (active engagement; new content is created) is only moderately correlated to the number of likes, reflecting a different engagement dimension in the corpus of tweets analyzed.

Given the higher engagement of the quote-type tweets, and the different behavior of the number of replies, the results indicate that metrics related to conversations could be especially relevant when it comes to better understanding the spread of patents on Twitter and developing impact metrics, while likes and retweets counts, despite being more numerous, could be less relevant when estimating impact.

If we take into account the total number of interactions received, likes (195,061) and retweets (67,941) are the most numerous metrics, while the number of replies (26,364) and quotes (8,360) are less used. These results are aligned with the engagement behavior found for scholarly tweets by Fang et al. (2022).

As regards the interaction rate, patents show low values: 1.54 likes per tweet and 0.54 retweets per tweet. These results are lower than those found by Fang et al. (2022) for research publications (2.95 likes per tweet and 1.91 retweets per tweet, respectively). While these results might evidence that patents are less engaged than scientific publications on Twitter, further research is deemed necessary to confirm this issue, as Fang et al. (2022) only managed publications indexed in Web of Science, and our contribution only analyzed patent IDs belonging to Google Patents. In any case, direct comparisons between patents and research publications should be discussed cautiously, as their communities of attention can overlap to some extent, but they are not necessarily the same.

Users linking to patents

Most users are sporadic (75.3% of all users published only one linking tweet), being responsible of a low percentage of all the published tweets (15.6%). Therefore, it is plausible to infer that the approach of these users to the dissemination of patents is limited and probably due to the specific topic of one patent or their relation to that patent (i.e. the user is the inventor), an aspect that should be studied in greater detail.

If we focus on the few highly performer users, most of them are individual accounts but the most productive are institutional accounts publishing as bots; around 50% of all linking tweets come from two bots. Although bots are not necessarily negative (they post many tweets, generate dissemination, and encourage conversation), this data confirms that the dissemination of patents on Twitter is fundamentally through automated accounts and not individual users who share patents to disseminate its contents, foster its dissemination or discuss about specific topics in which the patent becomes a relevant information resource in the conversation.

Other findings have revealed a remarkable instability of productive users (deleted or suspended accounts), an issue already detected for tweeted publications (Fang et al., 2020a, 2020b), and an increase of unique users in 2020 (from 4,530 users in 2019 to 9,829 users in 2020; see Fig. 5), which might be an effect of specific conversations around polemic topics (e.g., coronavirus vaccination). Future studies should also check whether the number of unique users linking to patents might decline after the pandemic.

Otherwise, the activity of the most productive users does not follow specific patterns. The number of followers, followings, total tweets published is quite different among these users. In other words, the users who tweet many patents do not have a defined social profile.

Volume of patents linked from Google patents

It was decided to combine all the different patent applications and granted patents registered under the same kind code as a unique patent ID. Even though there may be slight differences (e.g., publication date, claims) between these documents, they all refer to the same invention within the same jurisdiction. Therefore, the number of patents reported (86,417) should be understood at the level of the invention instead of the document.

While the combination of patent applications under the same kind code does not make it possible to calculate exact percentage values (the number of unique patent IDs is unknown), the percentage of patents tweeted is extremely low (results in Table 5 evidence this issue), given that Google Patents coverage is currently around 140 million documents (grants and applications). Taking this into account, the percentage of patents tweeted is around 0.06%, this value being an underrepresentation of the real value, which is estimated to be slightly higher.

Fang et al., (2020a, 2020b) reported altmetric data for “nearly 12.3 million Web of Science publications published between 2012 and 2018”, of which 34.01% had been mentioned on Twitter. Even though these percentages vary according to the Altmetrics aggregator, the temporal coverage of publications collected, the selected bibliographic database, and the promotion actions carried out by publishers, arguably, patents are less tweeted than publications.

The scarce number of patents tweeted along with their low impact might compromise the wide usage of social Patentometrics. A plausible reason is that researchers and publishers are interested in promoting their publications (Sugimoto et al., 2017). However, inventors do not follow this same rationale to promote patents, which could explain the large number of tweets coming from information services (many of them bots).

Patents most linked from twitter

The jurisdictions whose patents are full text indexed by Google Patents are those receiving most links from Twitter, especially USPTO patents. However, these results are biased due to the behavior of the accounts DailyPatent and uspatentbot, which jointly publish 53.3% of all linking tweets (Table 4), and which only tweet US patents.

The patent most tweeted comes from the Chinese Patent Office, being related to the covid-19. This issue also evidences the coronavirus effect on the dissemination of patents on Twitter, as 8 out of the 20 patents most tweeted are related to covid-19. This result reflects the importance of specific events in generating patent outreach on Twitter. Other variables such as the number of patent citations received (a signal of the relevance of the patent) do not correlate with the number of tweets linking the patent.

Otherwise, discrepancies in patent citation counts between Lens and Google Patents have been noticed for specific highly tweeted patents. A larger scale analysis should be carried out to check the correlation between these sources. As regards Google Scholar citation counts, the results reveal inconsistencies. Google Scholar computes citation counts for patents considering both patent citations and non-patent citations. However, the results obtained for highly tweeted patents show inconsistent results, which should also be checked. However, the unavailability of an API for Google Patents makes this task difficult.

Limitations

The results have shown a wide time lag between the appearance of Twitter and Google Patents (2006) and the appearance of the first tweet with a link to Google Patents (2015). In order to explain this late occurrence, the Internet Archive’s Wayback MachineFootnote 11 was used to check the operating of Google Patents since 2006 to 2015, discovering that other URLs were used as patent URL IDs in the early years of Google Patents, depending on the Google market, such as ‘google.com/patents/about?id = *’ or ‘google.es/patents/about?id = *’.

To check the effects of these URLs on the results, all tweets containing the seed “google.com/patents/about?id = *” were collected. As we observe in Table 8, only 278 tweets were found from 2009 to 2014. Considering that ‘google.com’ is the most important Google’s domain name, we estimate the volume of linking tweets very low.

Table 8 Number of tweets linking the old patent URL ID (google.com/patents/about?id = *)

All links using the old patent URL ID are currently broken and do not provide access to patent documents. In any case, the results reported in this work are limited to tweets from 2015 to 2021, as the URL seed analyzed was not active before 2015.

This fact reflects the limitation of using URL seeds (regular expressions) to collect linking tweets to patents. Moreover, future Google Patent website changes might create new URL IDs, making the data collection process complex. Still, this is the most effective method to collect the linking tweets because patents, unlike scientific publications, do not use standardized URL-based IDs, such as DOIs.

All the search queries performed have relied on the Academic Twitter API v2. Despite the service is quite efficient, recovery for all existing tweets for each query is not 100% guaranteed as “the Search API is focused on relevance and not completeness”, and some tweets (mainly spam, duplicate tweets or offensive tweets) may be missing from search results (Thelwall, 2015). Although this circumstance is explicitly pointed out by Twitter API v1Footnote 12 and this work uses API v2, it is likely that this same problem would occur. In any case, the loss of these tweets is unlikely to be problematic.

Bots have been identified via the Botometer application. Therefore, the results are limited to the accuracy of this tool. The identification of bots is a complex task as human users can tweet “like a bot”. Therefore, there is a margin of error for those accounts that do not define themselves as bots.

Finally, patents have been analyzed through Google Patents. However, other patent databases offer free online access to patent documents. Consequently, one same patent can be linked from Twitter via different URLs. Hence, all results reported should be limited to patents indexed Google Patents in particular, not to patent applications in general.

Future research

Even though this work has revealed general aspects about the dissemination and impact of patents on Twitter, the design of impact indicators at the patent level still requires a better understanding of the factors and variables involved in this communication process. Future lines of work in progress are indicated below:

Data inside the tweet

The analysis of the linking tweet text deems necessary to better understand the context in which the patent is being discussed or shared. Both quantitative (e.g. user mentions counts, hashtags counts, co-linked URLs counts) and qualitative (e.g. purpose of tweets, sentiment) should be carried out.

Data about the user

Each user linking to patents should be deeply described. For example, determining its role (inventor) or profession (industry, scholar, media, government, etc.) might provide new insights to better understand who and why tweet to patents. While pioneer studies aimed to characterize scholars on Twitter has been carried out (Costas et al., 2020; Mohammadi et al., 2018; Yu et al., 2019), there is a gap in the literature on industry researchers and users related to innovation on Twitter. In addition, given the predominant role of bots tweeting to patents, future studies should analyze the activity and impact of bots and humans separately.

Data about the engagement metrics

The engagement speed (e.g. the percentage of engagement received after 24 h since the publication of the linking tweet) and spread (e.g. the number and type of users engaging with the linking tweet, along with the languages used by these users and their origin) will allow a better understanding of the sensitivity and nature of the metrics, and therefore the design of more accurate impact indicators.

Data beyond Google patents

This work has analyzed one full text patent database. Future studies should consider other databases offering free full text access to patents (e.g. The Lens, Trea) to get a more comprehensive view of patents dissemination on Twitter, and other social media platforms (e.g. LinkedIn).

Conclusions

Twitter has been proved to be a relevant venue for disseminating patents and inventions in a large non-academic setting. This study has unraveled the volume of tweets linking to full-text patents, the impact of these linking tweets, the main Twitter users linking to patents as well as the patents most linked. These findings have allowed laying the foundations for social Patentometrics, in which patents are placed as linked/mentioned online resources in social media.

Even though more studies are needed to understand the mechanisms that regulate the dissemination and consumption of information related to patents on Twitter, this study has made it possible to determine that the existence of discovery tools (Google Scholar) and full-text online databases (Google Patents) were necessary to enhance the dissemination of patents.

The results obtained also allow obtaining metrics related to the interest on patents. Despite the metrics analyzed in this study are indirect and based on a first generation of metrics (engagement with the tweet instead of the patent itself), the use of second-generation metrics (clicks on links) as well as web usage data on patent visits/downloads will provide more evidence of the use of patents by different types of audience, including academic and industrial researchers, practitioners, and the broad public.