Abstract
This study attempts to analyze patents as cited/mentioned documents to better understand the interest, dissemination and engagement these documents attract in social environments, laying the foundations for social media studies of patents (social Patentometrics). In particular, this study aims to determine how patents are disseminated on Twitter by analyzing three elements: tweets linking to patents, users linking to patents, and patents linked from Twitter. To do this, all the tweets containing at least one link to a full-text patent available on Google Patents were collected and analyzed, yielding a total of 126,815 tweets (and 129,001 links) to 86,417 patents. The results show an increase in the number of linking tweets over the years, presumably due to the creation of a standardized patent URL ID and the integration of Google Patents and Google Scholar, which took place in 2015. The engagement achieved by these tweets is limited (80.2% of tweets did not attract likes) but has been increasing notably since 2018. Two super-publisher Twitter bot accounts (dailypatent and uspatentbot) are responsible for 53.3% of all the linking tweets, while most accounts are sporadic users linking to patents as part of a conversation. The most tweeted patents are, by far, from the United States (87.5% of all links to Google Patents), mainly due to the effect of the two super-publishers. The impact of patents in terms of the number of tweets linking to them is unrelated to their year of publication, status or number of patent citations received, while controversial and media topics might be more determinant factors. However, further research is needed to better understand the topics discussed around patents on Twitter, the users involved, and the metrics attained. Given the increasing number of linking users and linked patents, this study finds Twitter to be a relevant source for measuring patent-level metrics, shedding light on the impact of patents and the interest they attract among the broad public.
Introduction
The rise of the Altmetrics allowed measuring broader impact of scholarly publications (Adie, 2016; Holmberg, 2015; Sugimoto et al., 2017; Warren et al., 2017), expanding the cited publications to be analyzed (from journal articles to any scholarly output), the citing publications to be considered (from scholarly publications to non-scholarly publications), and the nature of the performance metrics available (from bibliographic citations to usage, dissemination, comments, discussion, rating or connectivity), leading to a new generation of research metrics (Orduña-Malea et al., 2016; Priem & Hemminger, 2010) led by social acts (Haustein et al., 2016a).
By embedding scholarly works or the online research objects representing them (e.g. the full-text article’s URL) on social applications and platforms (Haustein et al., 2016a), meta-researchers were able to capture signals of the interest, dissemination and engagement of the research endeavor beyond the scholarly environment (Tahamtan & Bornmann, 2020). In this way, the development of the Altmetrics field revealed differences in data prevalence between social media sources (Haustein et al., 2015; Thelwall, 2018) and between data aggregators (Zahedi et al., 2015; Ortega, 2018a, 2018b, 2020), as well as differences in data accumulation velocity (Fang & Costas, 2020), with those differences shaped by the characteristics of each discipline (Htoo & Na, 2017; Orduna-Malea & Delgado López-Cózar, 2019; Zahedi et al., 2014).
Patents have been scarcely studied within this social analytical framework. The role of patents in Altmetrics studies has been limited to being a source of citations (i.e. the citing publications) for the documents being monitored (i.e. the cited publications), in a way similar to the role they play in classic Patentometrics studies, where references from patents to papers are analyzed (Hammarfelt, 2021). Earlier webometric studies on patents have also focused on URL mentions from patents to other online resources (Orduna‐Malea et al., 2017; Font-Julián et al., 2022).
Altmetrics reflects an evolution from Scientometrics 1.0 (Fig. 1; mode A) to Scientometrics 2.0 or social Scientometrics (Fig. 1; mode B). However, Patentometrics 1.0 (Fig. 1; mode C) did not evolve into a social Patentometrics model (Fig. 1; mode D). This contribution intends precisely to lay the foundations for social Patentometrics studies.
The lack of social media metrics studies of patents might be due to the following considerations:
Purpose of patents
“A patent is a right granted to the owner of an invention that prevents others from making, using, importing, or selling the invention without the inventor’s permission” (Marley, 2014), limited to one specific jurisdiction during a limited period. Therefore, patents are legal documents, whose generation, consumption and impact are guided by dynamics different from those of the scholarly community.
Access to patents
While patents are made available to the public at large as part of the disclosure obligation of inventors (Graham and Hedge, 2015), most web-based patent databases lack advanced search features (Marley, 2014). Likewise, other databases are offered by the patent office where the patents were submitted to request protection, which limits their coverage to that jurisdiction and thus prevents the analysis of patent families (Martínez, 2011). In other cases, web-based patent databases offer only online descriptive metadata, with no online access to the full-text document.
Interest in patents
As part of the disclosure obligation, inventors are required to meet a few conditions, which can vary slightly from one patent office to another. For example, patentees in the United States must satisfy three conditions (Ouellette, 2012): a written description (disclosing the technological knowledge upon which the patent is based, and demonstrating that the patentee is in possession of the invention that is claimed), enablement (how to make and how to use the invention), and best mode (the patent must include “the best mode contemplated by the inventor or joint inventor of carrying out the invention”). This makes patents useful documents with technical solutions to problems, offering researchers both scientific and legal benefits for reading patents (Ouellette, 2017). However, the literature has also evidenced potential concerns related to the information offered. Patents might be perceived as vague, laden with legal jargon, and hard to read and understand. In addition, there is a perception that reading patents might lead to increased liability for ‘willful’ patent infringement (Ouellette, 2012). These issues might make patents less attractive documents for researchers (Lemley, 2008), turning them into underutilized scientific resources.
The subsequent emergence of open full-text patent databases accessible on the Web (each patent holds a URL) and covering documents from many patent offices around the world, enabled webometrics and Altmetrics studies to be carried out, allowing the social Patentometrics model (Fig. 1; mode D). These databases include patent search facilities (e.g. The Lens), patent search engines (e.g. Google Patents, Yandex Patents) and patent databases (e.g. Free Patents Online, Trea).
URLs linking to openly available full-text patents can be subsequently embedded on social media platforms (e.g. Twitter, LinkedIn), not only enhancing the dissemination of the patents but also allowing the capture of usage, interest and readership evidence, an aspect scarcely covered by the literature and then mostly through survey data (Ouellette, 2017), with no quantitative studies to date.
Among the currently available social media platforms, Twitter stands out due to its communicative nature and wide use. As of May 2022, Twitter exhibits a high number of users (around 229 million monetizable daily active users) and tweets published (around 850 million tweets per day). Moreover, the existence of an Academic API and the availability of a wide variety of engagement metrics make Twitter a suitable platform for capturing interest in patent publications.
Following the model proposed by Fang et al. (2021), Fig. 2 shows how Twitter can be employed to connect two landscapes: the innovation landscape and the Twitter landscape. Each online full-text patent holds a URL (research object) which can be embedded in a tweet. Other users can engage with this tweet by liking, retweeting or replying to it. In addition, by clicking on the URL, users can be redirected to the patent. This model can collect three types of metrics: URL-based metrics (e.g. number of times a patent URL has been mentioned), tweet-based metrics (e.g. number of likes that a tweet mentioning patents has received), and patent-based metrics (e.g. number of visits that a patent has received from Twitter).
The main objective of this study is to disclose how patents are disseminated on Twitter. Specifically, this work focuses on analyzing three elements: tweets linking to patents (e.g. what volume of tweets is generated, what type of tweets are published, what impact do they generate); users linking to patents (e.g. what type of users link to patents, what is their activity on Twitter); and patents linked from Twitter (e.g. which patents are linked more frequently, which Patent Offices obtain a greater diffusion, which are the main subjects covered by the tweeted patents). This exploratory and descriptive research intends to lay the foundations for the future design of engagement metrics aimed at understanding the interest, dissemination, and engagement of patent documents in social environments.
To address that main objective, the following research questions are set:
- RQ1. What is the volume of tweets linking to patents?
- RQ2. What is the impact of tweets linking to patents?
- RQ3. What type of Twitter users link patents?
- RQ4. Which patents are most linked from Twitter?
To answer these research questions, the Google Patents database is used as a case study. This decision is based on its wide global coverage of patents, its ease of use, and the generation of friendly URLs for each patent, which have made Google Patents one of the main patent discovery tools for researchers (Ouellette, 2015).
Research background
Google Patents: a global online full-text patent discovery tool
Google Patents is a search engine and discovery tool launched on December 14, 2006, that indexes full-text granted patents and patent applications. As of May 2022, Google Patents covers over 140 million patent publications from 105 patent offices around the globe. Full-text documents are indexed from 22 patent offices.
Google Patents includes advanced search engines, global litigation information and a Prior Art Finder Tool, which includes a copy of the “technical documents and books indexed in Google Scholar and Google Books, as well as documents included in the Prior Art Archive”.Footnote 1 For each patent, full-text, figures, the original PDF version, metadata, and citations are included. Patents with only non-English text have been machine-translated to English.Footnote 2
While Google Patents is currently used as data source for Patentometrics studies and literature reviews (e.g., Narayanankutty, 2019, 2022), the literature aimed at describing and characterizing its features from an informational perspective has been limited. Noruzi and Abdekhoda (2014) and Marley (2014) described its search functionalities, and Moskovkin et al. (2012) showed Google Patents as a patent-metric tool useful to analyze the patent activity of world-wide leading innovation companies. Other works have explored Google Patents as a tool to find patent citations to scholarly works (Kousha & Thelwall, 2015) or URL citations to university websites (Orduna‐Malea, Thelwall and Kousha, 2017).
Twitter: a social platform disseminating contents
Twitter is a real-time microblogging and social networking platform, launched in July 2006. Users can post brief plain text messages referred to as tweets, which can be liked or retweeted (with or without a quote) by other users. The platform collects metrics both at the user-level (e.g. number of followers, followings, tweets posted) and at the tweet-level (e.g. number of likes or retweets received) along with a wide variety of descriptive contextual metrics (e.g. tweet language, user location, etc.), making Twitter a potential tool to obtain scholarly metrics (Haustein, 2019).
Since the origin of Altmetrics, Twitter mentions have become one of the most important Altmetric events for scientific publications (Sugimoto et al., 2017), with the number of likes being an important factor for measuring the social media activity of users around science (Díaz-Faes et al., 2019), although limitations such as the stability of Twitter data (Fang et al., 2020a, 2020b) should be taken into account. Interactions between users and tweets around scholarly publications have also been analyzed (Friedrich et al., 2015; Haustein et al., 2015; Didegah et al., 2018; Hassan et al., 2021; Costas et al., 2021; Fang et al., 2021), including automatically generated content (Haustein et al., 2016b).
Despite the extensive literature on the scholarly use of Twitter, the dissemination of patents on this social networking platform, and the interest they attract, have not been studied to date.
Method
Tweets data collection
A Python script was employed via the Academic Twitter API v2Footnote 3 (full-archive search endpoint) to collect all tweets containing a URL to a full-text patent available on Google Patents, using the following URL seed: ‘patents.google.com/patent/*’. The data collection extended from 26 March 2006 (the birth of Twitter) to 31 December 2021. Data extraction was carried out on 23 February 2022. Original, reply and quoted tweets were considered, while retweets were excluded, as these tweets are re-publications that can distort the results obtained.
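The collection script itself is not reproduced in the text. As a rough illustration only, the sketch below assembles the request parameters for one page of the full-archive search endpoint. The query string, the selected fields and the function name `build_search_params` are assumptions made for this example, not the query actually used in the study, and no request is sent.

```python
from datetime import datetime, timezone
from typing import Optional

# Full-archive search endpoint of the Twitter API v2 (Academic Research access).
SEARCH_ALL_URL = "https://api.twitter.com/2/tweets/search/all"


def build_search_params(start: datetime, end: datetime,
                        next_token: Optional[str] = None) -> dict:
    """Assemble the query parameters for one page of results (nothing is sent)."""
    stamp = "%Y-%m-%dT%H:%M:%SZ"
    params = {
        # Assumed query: match URLs under the seed and exclude retweets.
        "query": 'url:"patents.google.com/patent" -is:retweet',
        "start_time": start.astimezone(timezone.utc).strftime(stamp),
        # end_time is exclusive, so 1 January 2022 covers all of 31 December 2021.
        "end_time": end.astimezone(timezone.utc).strftime(stamp),
        "max_results": 500,
        "tweet.fields": "author_id,conversation_id,created_at,"
                        "public_metrics,referenced_tweets,entities",
        "expansions": "author_id",
        "user.fields": "username",
    }
    if next_token:
        # Pagination token returned in the `meta` object of the previous page.
        params["next_token"] = next_token
    return params


params = build_search_params(
    datetime(2006, 3, 26, tzinfo=timezone.utc),
    datetime(2022, 1, 1, tzinfo=timezone.utc),
)
```

The resulting dictionary would be passed, together with a bearer token, to an HTTP client of choice, repeating the call while the response carries a `next_token`.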
For each tweet, the following parameters were captured: author id, username, tweet id, conversation id, public metrics (number of retweets, replies, likes, and quotes received), tweet type (original, reply, quoted), and the tweet text. Results were obtained in JSON format and subsequently transformed into CSV files via OpenRefine for further analysis. This process yielded 126,815 tweets from 26,106 users.
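The study performed the JSON-to-CSV step in OpenRefine. Purely as an illustrative alternative, the sketch below flattens an API v2 response into CSV rows with the Python standard library. The column names and the rule that a tweet which both quotes and replies is labeled "quoted" are assumptions of this example; the username is omitted because it arrives in a separate part of the response.

```python
import csv
import io
import json

FIELDS = ["tweet_id", "author_id", "conversation_id", "tweet_type",
          "retweets", "replies", "likes", "quotes", "text"]


def tweet_type(tweet: dict) -> str:
    """Classify a tweet from its `referenced_tweets` list (API v2 convention)."""
    kinds = {ref["type"] for ref in tweet.get("referenced_tweets", [])}
    if "quoted" in kinds:  # assumed priority when a tweet quotes and replies
        return "quoted"
    if "replied_to" in kinds:
        return "reply"
    return "original"


def flatten(tweet: dict) -> dict:
    """Turn one nested tweet object into a flat row."""
    metrics = tweet.get("public_metrics", {})
    return {
        "tweet_id": tweet["id"],
        "author_id": tweet["author_id"],
        "conversation_id": tweet["conversation_id"],
        "tweet_type": tweet_type(tweet),
        "retweets": metrics.get("retweet_count", 0),
        "replies": metrics.get("reply_count", 0),
        "likes": metrics.get("like_count", 0),
        "quotes": metrics.get("quote_count", 0),
        "text": tweet["text"],
    }


def to_csv(json_text: str) -> str:
    """Convert one JSON response page into CSV text with a header row."""
    tweets = json.loads(json_text)["data"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(flatten(t) for t in tweets)
    return buffer.getvalue()
```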
Accounts data collection
A second Python script was built (Users by ID endpoint) to collect users’ descriptive data. For each Twitter user, the following data were captured: username, created at, lists, tweets, followers, and followings. Data were captured on 17 May 2022.
To characterize users’ behavior, the Botometer APIFootnote 4 was employed. Botometer is an application that scores Twitter users from 0 to 5 considering six variables,Footnote 5 and using language-independent features (Sayyadiharikandeh, 2020). Scores near 5 reflect bot-like behavior, while scores near 0 reflect human-like behavior. The analysis covered the most productive Twitter users (those who published at least 10 tweets linking to patents). While this threshold might be perceived as a subjective decision, these users (536) are responsible for 70.1% of all tweets collected, constituting a representative sample. Data were collected on 24 May 2022, using the display universal score.
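The selection of productive users described above can be sketched as follows. This is a minimal illustration with hypothetical author ids; the function name `productive_users` and the toy data are not taken from the study.

```python
from collections import Counter


def productive_users(author_ids, min_tweets=10):
    """Return users with at least `min_tweets` linking tweets and their tweet share.

    `author_ids` holds one author id per collected tweet.
    """
    counts = Counter(author_ids)
    selected = {user: n for user, n in counts.items() if n >= min_tweets}
    share = sum(selected.values()) / len(author_ids) if author_ids else 0.0
    return selected, share


# Hypothetical toy data: one entry per linking tweet.
ids = ["a"] * 12 + ["b"] * 3 + ["c"] * 5
selected, share = productive_users(ids)
```

Applied to the full dataset, the same computation would be expected to return the 536 accounts and the 70.1% share reported in the text.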
Patents data collection
All URLs embedded in each of the 126,815 linking tweets were extracted, comprising 146,641 links, both to patents and to other online resources. As Google Patents uses URL aliases,Footnote 6 a data cleansing process was needed to identify the patents tweeted, by extracting the patent ID embedded in each Google Patents URL, which includes the country/area code (patent office), the patent number, and the kind code. All the kind codes related to the same patent number were combined to facilitate the analysis. This process yielded 86,417 patents.
The Patent API v.1.2.7 offered by The LensFootnote 7 was used to collect descriptive data related to each patent. The following parameters were considered: publication date, patent type, patent language, patent status, and patent citations. Data were collected on 27 May 2022.
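The cleansing step can be illustrated with a simplified sketch. The regular expression below is an assumption made for this example: it covers common identifier shapes (two-letter office code, an optional letter prefix such as D or RE, digits, and an optional kind code) but not every alias or numbering format, and it is not the procedure used in the study.

```python
import re
from urllib.parse import urlparse

# Simplified pattern: office code, number (optional letter prefix), optional kind code.
PATENT_PATH = re.compile(
    r"^/patent/([A-Z]{2})([A-Z]{0,2}\d+)([A-Z]\d?)?(?:/|$)", re.IGNORECASE
)


def parse_patent_url(url):
    """Return (office, number, kind_code) or None if the URL is not a patent page."""
    parsed = urlparse(url if "://" in url else "https://" + url)
    if parsed.netloc.lower() != "patents.google.com":
        return None
    match = PATENT_PATH.match(parsed.path)
    if not match:
        return None
    office, number, kind = match.groups()
    return office.upper(), number.upper(), (kind or "").upper()


def count_unique_patents(urls):
    """Count patents after merging all kind codes of the same office and number."""
    ids = set()
    for url in urls:
        parts = parse_patent_url(url)
        if parts:
            ids.add(parts[0] + parts[1])
    return len(ids)
```

Under these assumptions, `https://patents.google.com/patent/US4656917A/en` and `https://patents.google.com/patent/US4656917` would be counted as one patent.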
Results
RQ1. What is the volume of tweets linking to patents?
The first tweet containing a link to a Google Patents full-text patent appeared in 2015, a date that coincides with a significant update of Google Patents, including its integration with Google Scholar.Footnote 8 Since then, 126,815 tweets have been published (Fig. 3), mainly original tweets (75.6%) and replies (23%). The year 2018 marks a milestone (23,888 tweets published), a date that also coincides with another Google Patents update, in this case the addition of global litigation information via a partnership with Darts-IP,Footnote 9 currently part of Clarivate.
RQ2. What is the impact of tweets linking to patents?
The engagement received by tweets differs according to the type of tweet. Quotes and replies receive on average (arithmetic mean) more likes and retweets than original tweets (Table 1), which might imply that tweets that are part of a conversation generate more interest among users. Because the data are skewed, geometric means are also offered. In this case, original tweets improve their engagement notably. However, as the geometric mean operates only on values greater than 0 (i.e. eliminating all tweets with no engagement), the results can be misleading and should be interpreted cautiously and jointly with the arithmetic means obtained.
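The caveat about the geometric mean can be seen in a small sketch. The like counts below are hypothetical, and the function names are chosen for this example only.

```python
import math


def arithmetic_mean(values):
    """Plain average over all tweets, zeros included."""
    return sum(values) / len(values)


def geometric_mean_positive(values):
    """Geometric mean over strictly positive values only; zeros are dropped."""
    positive = [v for v in values if v > 0]
    if not positive:
        return None
    return math.exp(sum(math.log(v) for v in positive) / len(positive))


# Hypothetical like counts for ten tweets, six of them with no likes.
likes = [0, 0, 0, 0, 0, 0, 1, 2, 4, 8]
am = arithmetic_mean(likes)          # uses all ten tweets
gm = geometric_mean_positive(likes)  # uses only the four liked tweets
```

Here the geometric mean (about 2.83) exceeds the arithmetic mean (1.5) simply because the six tweets without likes are excluded, which is why the two figures should be read together.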
The prevalence of the engagement metrics is limited (Table 2). Only 19.8% of all tweets linking to full-text patents have attracted at least one like. Likewise, 12.8% of tweets have received at least one reply, 9.1% of tweets have attracted at least one retweet, and 3.4% of tweets have received at least one quote. These percentages, however, have annually increased since 2017.
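Prevalence, as used here, is the percentage of tweets with at least one interaction of a given kind. A minimal sketch follows, assuming each tweet is a dictionary with the four counts; the key names and the sample values are hypothetical.

```python
METRICS = ("likes", "replies", "retweets", "quotes")


def prevalence(tweets):
    """Percentage of tweets with a non-zero count, computed per metric."""
    total = len(tweets)
    return {
        metric: 100 * sum(1 for t in tweets if t[metric] > 0) / total
        for metric in METRICS
    }


# Hypothetical sample of five tweets.
sample = [
    {"likes": 3, "replies": 0, "retweets": 1, "quotes": 0},
    {"likes": 0, "replies": 0, "retweets": 0, "quotes": 0},
    {"likes": 0, "replies": 1, "retweets": 0, "quotes": 0},
    {"likes": 0, "replies": 0, "retweets": 0, "quotes": 0},
    {"likes": 0, "replies": 0, "retweets": 0, "quotes": 0},
]
result = prevalence(sample)
```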
All the engagement metrics measured show skewed distributions. The median values for all metrics and all years are zero. This way, few tweets achieve outstanding values (e.g. only 265 tweets attract more than 100 likes; only 105 tweets attract more than 100 retweets; and only 2 tweets attract more than 100 replies or quotes), while most tweets attract low interest: 12,819 tweets (50.9% of all tweets receiving likes) attract only one like.
The correlations (Spearman) between the engagement metrics are statistically significant but modest. Only the numbers of likes and retweets achieve a moderate positive correlation (Rs = 0.54; p-value < 0.0001; α = 0.05) (Table 3a). When the data are restricted to tweets with at least 10 likes received (Table 3b), results improve slightly, especially the correlation of quotes with the remaining metrics.
A plausible reason for the low correlation values obtained is the low prevalence of the engagement metrics, previously observed. The high number of tweets with 0 or 1 likes/retweets/replies/quotes distorts the correlations achieved. However, when the number of likes received achieves a threshold (around 10), the correlation between the numbers of likes and retweets increases (Fig. 4). This threshold effect is hardly noticeable for the number of quotes and replies, which seem to reflect a different engagement dimension.
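The rank correlation and the threshold restriction discussed above can be sketched in plain Python. This is an illustrative implementation (average ranks for ties, then Pearson on the ranks), not the software used in the study; it does not compute p-values and it fails with a division by zero if either variable is constant.

```python
import math


def average_ranks(values):
    """1-based ranks where tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def spearman_above(likes, other, min_likes=10):
    """Spearman correlation restricted to tweets with at least `min_likes` likes."""
    pairs = [(l, o) for l, o in zip(likes, other) if l >= min_likes]
    return spearman([p[0] for p in pairs], [p[1] for p in pairs])
```

Because the many tweets with zero interactions share the same tied rank, they pull the coefficient computed on the full dataset down, which is the effect the threshold is meant to remove.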
RQ3. What type of Twitter users link patents?
The 126,815 linking tweets have been published by 26,106 unique users. The number of unique users per year has increased notably since 2020. In 2021, a total of 10,006 unique users published 36,175 tweets linking to a full-text patent available on Google Patents (Fig. 5).
The distribution of linking tweets published per user is also quite skewed: 2 authors have published at least 100 tweets (high performers), while 19,656 users (75.3% of all users) have published only one linking tweet (sporadic users); 15.6% of all tweets come from sporadic users.
Two Twitter accounts (DailyPatent and uspatentbot) jointly publish 53.3% of all tweets (33,864 and 33,775 tweets, respectively), constituting the most influential users. These accounts are highly productive, do not follow other users, and most of their tweets published link to Google Patents (68.6% and 92.9%, respectively).
Considering the most productive users (Table 4), no specific characteristics regarding their behavior can be distinguished. We can find productive users with a high number of followers and followings (e.g. tatzanx), productive users with a high number of tweets published but few followers (e.g. DinahParums), productive users with a low number of both followers and followings (e.g. covidventilator), or unproductive users with a high number of followings (e.g. PPAtrading). Patents_bot and patentsexpiring are outlier accounts: all their published tweets link to patents. In other cases, the tweets linking to patents account for a low percentage of the total tweets published (e.g. COILPOD).
A detailed analysis of the 536 moderate and high performers (those publishing at least 10 tweets linking to Google Patents) reveals three features: a loss of user accounts (32 accounts had been suspended or deleted at the time of the data analysis); a remarkable presence of bots (or humans publishing like bots), with 10% of the 504 active accounts exhibiting a Botometer display universal score higher than 4, of which 32 are self-declared bots (Fig. 6); and a predominance of individual (445) over organizational (59) accounts.
RQ4. Which patents are most linked from Twitter?
The United States Patent and Trademark Office (USPTO) is the jurisdiction receiving the most links from Twitter (112,949 links to 81,156 patents), followed at a great distance by the World Intellectual Property Organization (WIPO), the China National Intellectual Property Administration (CNIPA), the European Patent Office (EPO) and the Japan Patent Office (JPO). The remaining jurisdictions receive few links. Also noteworthy is the low number of links to patents from the Denmark (8), Austria (5), Finland (3), Brazil (2) and Belgium (0) jurisdictions, whose patents are indexed in full text on Google Patents.
Table 5 includes the number of patents tweeted and the number of links to unique patents tweeted per patent office. Even considering that each patent measured includes all the patent applications and granted patents related to the same patent ID, the number of patents linked from Twitter constitutes a small percentage of the patents covered by Google Patents for each jurisdiction.
Human necessities (383 patents) and Physics (156 patents) are the subjects most covered by the patents most tweeted (those patents tweeted at least 5 times; N = 818). The presence of the remaining subjects is low (Table 6).
A Chinese-language patent application (CN112220919) related to covid-19 (with CPC code “A” [human necessities] assigned), published in 2021 and still pending, is the most tweeted full-text patent. This fact suggests that neither the time of publication nor the patent citation count or status is a decisive variable in obtaining a higher dissemination on Twitter. Table 7 delves into this issue by displaying the top 20 tweeted patents, including the status, the publication date, the first CPC code, the number of patent citations (both from Lens and Google Patents), and the total number of citations (from Google Scholar). As we observe, highly tweeted patents include old/recent, active/expired and highly/lowly cited patents. Indeed, few highly cited patents have been highly tweeted (Fig. 7).
Discussion
This study represents the first attempt to study patent documents linked from Twitter. The linking tweets (volume, type, and impact), linking users (productivity, type, profile) and linked patents (jurisdiction, subject, and impact) have been determined and characterized. The main findings are discussed below.
Volume of tweets linking to patents
The number of linking tweets has been increasing since 2015, constituting a large-scale dataset for measuring purposes. The years 2020 and 2021 are remarkable, with more than 30,000 linking tweets published in each of these years.
The increase in the overall volume of tweets over the years (around 500 million tweets per day were published in August 2013; this figure had risen to around 867 million tweets per day in August 2022)Footnote 10 and the launch of the bots Dailypatent and uspatentbot (created in 2016 and 2017, respectively) seem to be among the main Twitter-related causes of the take-off of tweets embedding links to Google Patents.
Beyond the Twitter activity, the integration of Google Scholar with Google Patents in 2015 might have also influenced the rise of linking tweets. Given the importance of Google Scholar as a discovery tool (Delgado López-Cózar et al., 2019), the use of this academic search engine could facilitate the discovery of patents, which users later spread through Twitter. These results are also aligned with the findings obtained by Ouellette (2017), who surveyed 832 US academic and industry researchers and found that 43% of the respondents found patents through Google Scholar, making this academic search engine the third most used method to find patents after Google Patents (50%) and the USPTO website (45%). Therefore, we can infer that there was a latent interest in disseminating patents (the existence of nearly 127,000 tweets linking to full-text patents available on Google Patents in 7 years might evidence that interest), and that the Google Scholar/Google Patents combination played a key role in the process of social dissemination of patents on Twitter.
A slight slowdown in the generation of linking tweets is detected in 2021. Therefore, it should be checked if after the covid-19 pandemic the number of linking tweets could be reduced, which seems likely given the remarkable number of coronavirus-related patents among the most tweeted patents (Table 7).
Impact of tweets linking to patents
Overall, 73.3% of all linking tweets (92,968) have obtained no engagement at all (zero likes, retweets, replies and quotes), while only 1.6% (1,637) have obtained at least one interaction in each of the engagement metrics measured. This result is aligned with the user engagement found for 7,037,233 unique original scholarly tweets (Fang et al., 2022), where only 2% of the tweets attained engagement in all four types of user engagement.
The data prevalence for each of the engagement metrics achieves low percentages (see Table 2). Therefore, the results evidence a low impact, especially for original-type tweets. Reply-type tweets attain higher average engagements. For this reason, this type of tweets might be of great interest to locate and study conversations between Twitter users in which links to full-text patents are included as information resources.
The prevalence percentage of all the engagement metrics is increasing over the years. Hence, if the upward trend were to continue, the relevance of tweets to learn about the social dissemination of patents would increase. However, the skewed distribution found for all the engagement metrics (i.e. few tweets attract most of the engagement) could imply a high dependence on the overall impact of a few tweets linking to a few patents, which could have been mentioned for any reason. For example, the tweet 1313565051048128513, published in 2020, is the tweet most liked (7,803 likes), retweeted (3,033 retweets) and quoted (196 quotes) in the dataset, appearing in 127 linking tweets. This tweet embeds a URL to the patent US4656917A, a historic patent whose inventor is Van Halen, a famous guitarist who passed away that year. Likewise, the most tweeted patent in the dataset (CN112220919; embedded in 2,490 tweets) describes the invention of a new coronavirus vaccine that contains graphene oxide, which has generated extensive discussion and controversy both on and off Twitter, inside and outside the academic community.
The moderate correlations between the engagement metrics are partly caused by the effects of skewed distributions, that is, the high percentage of zero results, which should be avoided in overall descriptive analyses. Establishing a threshold (tweets with at least 10 likes received), the data shows a moderate significant positive correlation between the number of likes and retweets received (both reflecting a passive engagement; new content is not created). While the number of quotes received can be analyzed as an active engagement (i.e. new content is created), this metric is closer to the number of retweets received, which makes sense as the quote is a type of retweet (i.e. the users decide to quote when they are retweeting). Finally, the number of replies received (active engagement; new content is created) is only moderately correlated to the number of likes, reflecting a different engagement dimension in the corpus of tweets analyzed.
Given the higher engagement of the quote-type tweets, and the different behavior of the number of replies, the results indicate that metrics related to conversations could be especially relevant when it comes to better understanding the spread of patents on Twitter and developing impact metrics, while likes and retweets counts, despite being more numerous, could be less relevant when estimating impact.
If we take into account the total number of interactions received, likes (195,061) and retweets (67,941) are the most numerous metrics, while the number of replies (26,364) and quotes (8,360) are less used. These results are aligned with the engagement behavior found for scholarly tweets by Fang et al. (2022).
As regards the interaction rate, patents show low values: 1.54 likes per tweet and 0.54 retweets per tweet. These results are lower than those found by Fang et al. (2022) for research publications (2.95 likes per tweet and 1.91 retweets per tweet, respectively). While these results might evidence that patents are less engaged than scientific publications on Twitter, further research is deemed necessary to confirm this issue, as Fang et al. (2022) only managed publications indexed in Web of Science, and our contribution only analyzed patent IDs belonging to Google Patents. In any case, direct comparisons between patents and research publications should be discussed cautiously, as their communities of attention can overlap to some extent, but they are not necessarily the same.
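The interaction rates quoted above follow directly from the totals reported in the text (126,815 tweets; 195,061 likes; 67,941 retweets; 26,364 replies; 8,360 quotes). The short sketch below reproduces that arithmetic; the function name is chosen for this example.

```python
TWEETS = 126_815
INTERACTIONS = {
    "likes": 195_061,
    "retweets": 67_941,
    "replies": 26_364,
    "quotes": 8_360,
}


def interaction_rates(interactions, tweets):
    """Average number of interactions per tweet, rounded to two decimals."""
    return {name: round(total / tweets, 2) for name, total in interactions.items()}


rates = interaction_rates(INTERACTIONS, TWEETS)
# rates["likes"] is 1.54 and rates["retweets"] is 0.54, matching the text.
```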
Users linking to patents
Most users are sporadic (75.3% of all users published only one linking tweet) and are responsible for a low percentage of all the published tweets (15.6%). Therefore, it is plausible to infer that the approach of these users to the dissemination of patents is limited and probably due to the specific topic of one patent or their relation to that patent (i.e. the user is the inventor), an aspect that should be studied in greater detail.
If we focus on the few high-performing users, most of them are individual accounts, but the most productive are institutional accounts publishing as bots; around 50% of all linking tweets come from two bots. Although bots are not necessarily negative (they post many tweets, generate dissemination, and encourage conversation), these data confirm that patents are disseminated on Twitter fundamentally through automated accounts rather than through individual users who share patents to disseminate their contents, foster their dissemination or discuss specific topics in which the patent becomes a relevant information resource in the conversation.
Other findings have revealed a remarkable instability of productive users (deleted or suspended accounts), an issue already detected for tweeted publications (Fang et al., 2020a, 2020b), and an increase of unique users in 2020 (from 4,530 users in 2019 to 9,829 users in 2020; see Fig. 5), which might be an effect of specific conversations around controversial topics (e.g., coronavirus vaccination). Future studies should also check whether the number of unique users linking to patents declines after the pandemic.
Otherwise, the activity of the most productive users does not follow specific patterns. The numbers of followers, followed accounts, and total tweets published differ considerably among these users. In other words, the users who tweet many patents do not have a defined social profile.
Volume of patents linked from Google patents
It was decided to combine all the different patent applications and granted patents registered under the same kind code as a unique patent ID. Even though there may be slight differences (e.g., publication date, claims) between these documents, they all refer to the same invention within the same jurisdiction. Therefore, the number of patents reported (86,417) should be understood at the level of the invention instead of the document.
While the combination of patent applications under the same kind code does not make it possible to calculate exact percentage values (the number of unique patent IDs is unknown), the percentage of patents tweeted is extremely low (see Table 5), given that Google Patents currently covers around 140 million documents (grants and applications). On this basis, around 0.06% of patents have been tweeted. This figure underestimates the real value, which is estimated to be slightly higher because the denominator counts documents rather than unique patent IDs.
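The estimate can be reproduced with the two figures given in the text (86,417 tweeted patents and roughly 140 million documents in Google Patents). The snippet below is only a worked calculation; the constant and function names are chosen for this example.

```python
TWEETED_PATENTS = 86_417                 # unique patent IDs linked from Twitter (this study)
GOOGLE_PATENTS_DOCUMENTS = 140_000_000   # approximate coverage: grants and applications


def tweeted_percentage(tweeted, total):
    """Percentage of the collection that has received at least one linking tweet."""
    return 100 * tweeted / total


share = tweeted_percentage(TWEETED_PATENTS, GOOGLE_PATENTS_DOCUMENTS)
print(f"{share:.2f}%")  # prints 0.06%
```

Since the denominator counts documents whereas the numerator counts merged patent IDs, the true percentage would be somewhat higher if the number of unique patent IDs were known.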
Fang et al. (2020a, 2020b) reported altmetric data for “nearly 12.3 million Web of Science publications published between 2012 and 2018”, of which 34.01% had been mentioned on Twitter. Even though these percentages vary according to the Altmetrics aggregator, the temporal coverage of publications collected, the selected bibliographic database, and the promotion actions carried out by publishers, arguably, patents are less tweeted than publications.
The small number of patents tweeted, along with their low impact, might compromise the wide usage of social Patentometrics. A plausible reason is that, whereas researchers and publishers are interested in promoting their publications (Sugimoto et al., 2017), inventors do not follow the same rationale to promote patents, which could explain the large number of tweets coming from information services (many of them bots).
Patents most linked from Twitter
The jurisdictions whose patents are full text indexed by Google Patents are those receiving most links from Twitter, especially USPTO patents. However, these results are biased due to the behavior of the accounts DailyPatent and uspatentbot, which jointly publish 53.3% of all linking tweets (Table 4), and which only tweet US patents.
The most tweeted patent comes from the Chinese Patent Office and is related to COVID-19. This also evidences the coronavirus effect on the dissemination of patents on Twitter, as 8 out of the 20 most tweeted patents are related to COVID-19. This result reflects the importance of specific events in generating patent outreach on Twitter. Other variables, such as the number of patent citations received (a signal of the relevance of the patent), do not correlate with the number of tweets linking to the patent.
In addition, discrepancies in patent citation counts between Lens and Google Patents have been noticed for specific highly tweeted patents. A larger-scale analysis should be carried out to check the correlation between these sources. Google Scholar computes citation counts for patents considering both patent citations and non-patent citations, but the counts obtained for highly tweeted patents are inconsistent and should also be checked. The unavailability of an API for Google Patents, however, makes this task difficult.
Limitations
The results have shown a wide time lag between the appearance of Twitter and Google Patents (2006) and the appearance of the first tweet with a link to Google Patents (2015). To explain this late occurrence, the Internet Archive’s Wayback Machine (footnote 11) was used to check how Google Patents operated from 2006 to 2015. This revealed that other URLs were used as patent URL IDs in the early years of Google Patents, depending on the Google market, such as ‘google.com/patents/about?id=*’ or ‘google.es/patents/about?id=*’.
To check the effects of these URLs on the results, all tweets containing the seed “google.com/patents/about?id=*” were collected. As shown in Table 8, only 278 tweets were found from 2009 to 2014. Considering that ‘google.com’ is Google’s most important domain name, we estimate that the volume of tweets linking through the old URL format is very low.
All links using the old patent URL ID are currently broken and do not provide access to patent documents. In any case, the results reported in this work are limited to tweets from 2015 to 2021, as the URL seed analyzed was not active before 2015.
This fact reflects the limitation of using URL seeds (regular expressions) to collect tweets linking to patents. Moreover, future changes to the Google Patents website might create new URL IDs, complicating the data collection process. Still, this is the most effective method to collect the linking tweets because patents, unlike scientific publications, do not use standardized URL-based IDs, such as DOIs.
All the search queries performed have relied on the Academic Twitter API v2. Although the service is quite efficient, retrieval of all existing tweets for each query is not 100% guaranteed, as “the Search API is focused on relevance and not completeness”, and some tweets (mainly spam, duplicate tweets or offensive tweets) may be missing from search results (Thelwall, 2015). Although this circumstance is explicitly pointed out for Twitter API v1 (footnote 12) and this work uses API v2, it is likely that the same problem occurs. In any case, the loss of these tweets is unlikely to be problematic.
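To make the notion of a URL seed concrete, the sketch below shows how regular expressions could separate links in the current format from links in the legacy format. The patterns are assumptions written for this example: the '/patent/<ID>' path and the shape of the patent ID are not taken from this study, the sample ID is invented, and the expressions are not the queries actually submitted to the Twitter API.

```python
import re

# Assumed current format: patents.google.com/patent/<ID>, optionally followed by a language path
CURRENT_SEED = re.compile(
    r"(?:https?://)?patents\.google\.com/patent/([A-Z]{2}[0-9A-Z]+)",
    re.IGNORECASE,
)

# Legacy format described in the text: google.<market>/patents/about?id=<internal id>
LEGACY_SEED = re.compile(
    r"(?:https?://)?(?:www\.)?google\.[a-z.]+/patents/about\?id=([\w-]+)",
    re.IGNORECASE,
)


def classify_link(url):
    """Return (format, identifier) for a Google Patents link, or None if no seed matches."""
    match = CURRENT_SEED.search(url)
    if match:
        return ("current", match.group(1).upper())
    match = LEGACY_SEED.search(url)
    if match:
        return ("legacy", match.group(1))
    return None


# Hypothetical links, for illustration only
print(classify_link("https://patents.google.com/patent/US1234567B2/en"))
print(classify_link("http://www.google.es/patents/about?id=abc123"))
print(classify_link("https://scholar.google.com/citations?user=x"))
```

Each change to the URL scheme requires a new pattern, which is the maintenance cost that a persistent identifier such as a DOI avoids.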
Bots have been identified via the Botometer application. Therefore, the results are limited to the accuracy of this tool. The identification of bots is a complex task as human users can tweet “like a bot”. Therefore, there is a margin of error for those accounts that do not define themselves as bots.
Finally, patents have been analyzed through Google Patents. However, other patent databases offer free online access to patent documents. Consequently, the same patent can be linked from Twitter via different URLs. Hence, all results reported should be limited to patents indexed in Google Patents in particular, not to patent applications in general.
Future research
Even though this work has revealed general aspects about the dissemination and impact of patents on Twitter, the design of impact indicators at the patent level still requires a better understanding of the factors and variables involved in this communication process. Future lines of work in progress are indicated below:
Data inside the tweet
The analysis of the linking tweet text is deemed necessary to better understand the context in which the patent is being discussed or shared. Both quantitative analyses (e.g. user mention counts, hashtag counts, co-linked URL counts) and qualitative analyses (e.g. purpose of tweets, sentiment) should be carried out.
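A first quantitative pass over the tweet text could be sketched as follows. The regular expressions are simplified assumptions (they do not reproduce Twitter's own entity extraction), and the sample text and patent ID are invented.

```python
import re

URL = re.compile(r"https?://\S+")
MENTION = re.compile(r"(?<!\w)@(\w{1,15})")   # simplified pattern for user handles
HASHTAG = re.compile(r"(?<!\w)#(\w+)")        # simplified pattern for hashtags


def tweet_text_features(text):
    """Count user mentions, hashtags and URLs in the text of a tweet."""
    urls = URL.findall(text)
    # Remove URLs first so that '#fragment' or '@' inside a link is not counted
    text_without_urls = URL.sub(" ", text)
    return {
        "mentions": len(MENTION.findall(text_without_urls)),
        "hashtags": len(HASHTAG.findall(text_without_urls)),
        "urls": len(urls),
    }


# Hypothetical tweet text, for illustration only
text = (
    "Interesting claim @someone #patents #covid19 "
    "https://patents.google.com/patent/US1234567B2/en https://example.org/a#frag"
)
features = tweet_text_features(text)
print(features)
```

Counting URLs separately also makes it possible to identify co-linked resources, that is, other links shared in the same tweet as the patent.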
Data about the user
Each user linking to patents should be described in depth. For example, determining their role (inventor) or profession (industry, scholar, media, government, etc.) might provide new insights to better understand who tweets links to patents and why. While pioneering studies aimed at characterizing scholars on Twitter have been carried out (Costas et al., 2020; Mohammadi et al., 2018; Yu et al., 2019), there is a gap in the literature on industry researchers and users related to innovation on Twitter. In addition, given the predominant role of bots tweeting links to patents, future studies should analyze the activity and impact of bots and humans separately.
Data about the engagement metrics
The engagement speed (e.g. the percentage of engagement received within 24 h of the publication of the linking tweet) and spread (e.g. the number and type of users engaging with the linking tweet, along with the languages used by these users and their origin) will allow a better understanding of the sensitivity and nature of the metrics, and therefore the design of more accurate impact indicators.
Data beyond Google Patents
This work has analyzed one full-text patent database. Future studies should consider other databases offering free full-text access to patents (e.g. The Lens, Trea) to get a more comprehensive view of patent dissemination on Twitter, and should extend the analysis to other social media platforms (e.g. LinkedIn).
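The engagement speed indicator could be operationalized as in the sketch below, which assumes that a timestamp is available for each engagement event (like, retweet or reply). This availability is an assumption of the example and depends on the data source; the function name and the toy timestamps are likewise hypothetical.

```python
from datetime import datetime, timedelta


def engagement_speed(published_at, engagement_times, window=timedelta(hours=24)):
    """Percentage of engagement events occurring within `window` of publication.

    Returns None when the tweet received no engagement at all.
    """
    if not engagement_times:
        return None
    early = sum(
        1 for t in engagement_times
        if timedelta(0) <= t - published_at <= window
    )
    return 100 * early / len(engagement_times)


# Toy example: four engagement events, two of them within the first 24 hours
published = datetime(2021, 1, 1, 12, 0)
events = [
    published + timedelta(hours=1),
    published + timedelta(hours=23),
    published + timedelta(hours=30),
    published + timedelta(hours=72),
]
speed = engagement_speed(published, events)
print(speed)
```

Changing the `window` argument (for example, to one hour or one week) would allow the sensitivity of the indicator to the chosen time frame to be tested.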
Conclusions
Twitter has proved to be a relevant venue for disseminating patents and inventions in a large non-academic setting. This study has unraveled the volume of tweets linking to full-text patents, the impact of these linking tweets, the main Twitter users linking to patents, as well as the most linked patents. These findings lay the foundations for social Patentometrics, in which patents are placed as linked/mentioned online resources in social media.
Even though more studies are needed to understand the mechanisms that regulate the dissemination and consumption of information related to patents on Twitter, this study has made it possible to determine that the existence of discovery tools (Google Scholar) and full-text online databases (Google Patents) was necessary to enhance the dissemination of patents.
The results obtained also make it possible to derive metrics related to the interest in patents. Although the metrics analyzed in this study are indirect and based on a first generation of metrics (engagement with the tweet instead of the patent itself), the use of second-generation metrics (clicks on links) as well as web usage data on patent visits/downloads will provide more evidence of the use of patents by different types of audience, including academic and industrial researchers, practitioners, and the broad public.
Notes
fake_follower: bots purchased to increase follower counts; self_declared: bots from botwiki.org; astroturf: manually labeled political bots and accounts involved in follow trains that systematically delete content; spammer: accounts labeled as spambots from several datasets; financial: bots that post using cashtags; other: miscellaneous other bots obtained from manual annotation, user feedback, etc.
The following URLs were found: patents.google.com; google.co.uk/patents; google.co.ug/patents; google.com.ar/patents; google.com.au/patents; google.com.br/patents; google.com.gi/patents; google.com.na/patents; google.com.pg/patents; google.com.tr/patents; google.de/patents; google.es/patents; google.ch/patents; google.ca/patents.
References
Adie, E. (2016). The rise of altmetrics. In A. Tattersall (Ed.), Altmetrics: A practical guide for librarians, researchers and academics (pp. 67–82). Facet Publishing.
Costas, R., Mongeon, P., Ferreira, M. R., Honk, J., & Franssen, T. (2020). Large-scale identification and characterization of scholars on Twitter. Quantitative Science Studies, 1(2), 771–791. https://doi.org/10.1162/qss_a_00047
Costas, R., de Rijcke, S., & Marres, N. (2021). “Heterogeneous couplings”: Operationalizing network perspectives to study science-society interactions through social media metrics. Journal of the Association for Information Science and Technology, 72(5), 595–610. https://doi.org/10.1002/asi.24427
Delgado López-Cózar, E., Orduña-Malea, E., & Martín-Martín, A. (2019). Google Scholar as a data source for research assessment. In W. Glänzel, H. F. Moed, U. Schmoch, & M. Thelwall (Eds.), Springer handbook of science and technology indicators (pp. 95–127). Springer.
Díaz-Faes, A. A., Bowman, T. D., & Costas, R. (2019). Towards a second generation of ‘social media metrics’: Characterizing Twitter communities of attention around science. PLoS ONE, 14(5), e0216408. https://doi.org/10.1371/journal.pone.0216408
Didegah, F., Mejlgaard, N., & Sørensen, M. P. (2018). Investigating the quality of interactions and public engagement around scientific papers on twitter. Journal of Informetrics, 12(3), 960–971. https://doi.org/10.1016/j.joi.2018.08.002
Fang, Z., & Costas, R. (2020). Studying the accumulation velocity of altmetric data tracked by Altmetric.com. Scientometrics, 123(2), 1077–1101. https://doi.org/10.1007/s11192-020-03405-9
Fang, Z., Costas, R., Tian, W., Wang, X., & Wouters, P. (2020a). An extensive analysis of the presence of altmetric data for Web of Science publications across subject fields and research topics. Scientometrics, 124(3), 2519–2549. https://doi.org/10.1007/s11192-020-03564-9
Fang, Z., Dudek, J., & Costas, R. (2020b). The stability of twitter metrics: A study on unavailable twitter mentions of scientific publications. Journal of the Association for Information Science and Technology, 71(12), 1455–1469. https://doi.org/10.1002/asi.24344
Fang, Z., Costas, R., Tian, W., Wang, X., & Wouters, P. (2021). How is science clicked on Twitter? Click metrics for Bitly short links to scientific publications. Journal of the Association for Information Science and Technology, 72(7), 918–932. https://doi.org/10.1002/asi.24458
Fang, Z., Costas, R., & Wouters, P. (2022). User engagement with scholarly tweets of scientific papers: A large-scale and cross-disciplinary analysis. Scientometrics, 127(8), 4523–4546. https://doi.org/10.1007/s11192-022-04468-6
Font-Julián, C. I., Ontalba-Ruipérez, J. A., Orduña-Malea, E., & Thelwall, M. (2022). Which types of online resource support US patent claims? Journal of Informetrics, 16(1), 1–14. https://doi.org/10.1016/j.joi.2021.101247
Friedrich, N., Bowman, T. D., Stock, W. G., & Haustein, S. (2015). Adapting sentiment analysis for tweets linking to scientific papers. In A. Ali Salah, Y. Tonta, A.A.A., Salah, C. Sugimoto, & U. Al (Eds.). Proceedings of the 15th International Society of Scientometrics and Informetrics Conference (pp. 107–108) https://www.issi-society.org/proceedings/issi_2015/0107.pdf
Graham, S., & Hegde, D. (2015). Disclosing patents’ secrets. Science, 347(6219), 236–237. https://doi.org/10.1126/science.1262080
Hammarfelt, B. (2021). Linking science to technology: The “patent paper citation” and the rise of patentometrics in the 1980s. Journal of Documentation, 77(6), 1413–1429. https://doi.org/10.1108/JD-12-2020-0218
Hassan, S.-U., Saleem, A., Soroya, S. H., Safder, I., Iqbal, S., Jamil, S., Bukhari, F., Aljohani, N. R., & Nawaz, R. (2021). Sentiment analysis of tweets through Altmetrics: A machine learning approach. Journal of Information Science, 47(6), 712–726. https://doi.org/10.1177/0165551520930917
Haustein, S., Costas, R., & Larivière, V. (2015). Characterizing social media metrics of scholarly papers: The effect of document properties and collaboration patterns. PLoS ONE, 10(3), e0120495. https://doi.org/10.1371/journal.pone.0120495
Haustein, S., Bowman, T. D., & Costas, R. (2016a). Interpreting “altmetrics”: Viewing acts on social media through the lens of citation and social theories. In C. R. Sugimoto (Ed.), Theories of informetrics and scholarly communication: A festschrift in honor of Blaise Cronin (pp. 372–405). De Gruyter. https://doi.org/10.1515/9783110308464-022
Haustein, S., Bowman, T. D., Holmberg, K., Tsou, A., Sugimoto, C. R., & Larivière, V. (2016b). Tweets as impact indicators: Examining the implications of automated ‘bot’ accounts on Twitter. Journal of the Association for Information Science and Technology, 67(1), 232–238. https://doi.org/10.1002/asi.23456
Holmberg, K. J. (2015). Altmetrics for information professionals: Past, present and future. Chandos Publishing.
Haustein, S. (2019). Scholarly twitter metrics. In W. Glänzel, H. F. Moed, U. Schmoch, & M. Thelwall (Eds.), Springer handbook of science and technology indicators (pp. 729–760). Springer. https://doi.org/10.1007/978-3-030-02511-3_28
Htoo, T. H. H., & Na, J. C. (2017). Disciplinary differences in altmetrics for social sciences. Online Information Review, 41(2), 235–251. https://doi.org/10.1108/OIR-12-2015-0386
Kousha, K., & Thelwall, M. (2015). Patent citation analysis with Google. Journal of the Association for Information Science and Technology, 68(1), 48–61. https://doi.org/10.1002/asi.23608
Lemley, M. A. (2008). Ignoring Patents. Michigan State Law Review, 2008(1), 19–34.
Marley, M. (2014). Full-text patent searching on free websites: Tools, tips and tricks. Business Information Review, 31(4), 226–236. https://doi.org/10.1177/0266382114564265
Martínez, C. (2011). Patent families: When do different definitions really matter? Scientometrics, 86(1), 39–63. https://doi.org/10.1007/s11192-010-0251-3
Mohammadi, E., Thelwall, M., Kwasny, M., & Holmes, K. L. (2018). Academic information on twitter: A user survey. PLoS ONE, 13(5), e0197265. https://doi.org/10.1371/journal.pone.0197265
Moskovkin, V. M., Shigorina, N. A., & Popov, D. (2012). The possibility of using the Google Patents search tool in patentometric analysis (based on the example of the world’s largest innovative companies). Scientific and Technical Information Processing, 39(2), 107–112. https://doi.org/10.3103/S0147688212020086
Narayanankutty, A. (2019). PI3K/Akt/mTOR pathway as a therapeutic target for colorectal cancer: A review of preclinical and clinical evidence. Current Drug Targets, 20(12), 1217–1226. https://doi.org/10.2174/1389450120666190618123846
Narayanankutty, A. (2022). Pharmacological potentials and nutritional values of tropical and subtropical fruits of India: Emphasis on their anticancer bioactive components. Recent Patents on Anti-Cancer Drug Discovery, 17(2), 124–135. https://doi.org/10.2174/1574892816666211130165200
Noruzi, A., & Abdekhoda, M. (2014). Google Patents: The global patent search engine. Webology, 11(1). https://www.webology.org/2014/v11n1/a122.pdf
Ouellette, L. (2012). Do Patents Disclose Useful Information. Harvard Journal of Law & Technology, 25(2), 545–608.
Ouellette, L. (2017). Who reads patents? Nature Biotechnology, 35(5), 421–424. https://doi.org/10.1038/nbt.3864
Orduña-Malea, E., Martín-Martín, A., & Delgado-López-Cózar, E. (2016). The next bibliometrics: ALMetrics (Author Level Metrics) and the multiple faces of author impact. Profesional De La Información, 25(3), 485–496. https://doi.org/10.3145/epi.2016.may.18
Orduna-Malea, E., Thelwall, M., & Kousha, K. (2017). Web citations in patents: Evidence of technological impact? Journal of the Association for Information Science and Technology, 68(8), 1967–1974. https://doi.org/10.1002/asi.23821
Ortega, J. L. (2018a). Disciplinary differences of the impact of altmetric. FEMS Microbiology Letters, 365(7), fny049. https://doi.org/10.1093/femsle/fny049
Ortega, J. L. (2018b). Reliability and accuracy of altmetric providers: A comparison among Altmetric.com, PlumX and Crossref Event Data. Scientometrics, 116(3), 2123–2138. https://doi.org/10.1007/s11192-018-2838-z
Orduna-Malea, E., & Delgado López-Cózar, E. (2019). Demography of Altmetrics under the light of Dimensions: Locations, institutions, journals, disciplines and funding bodies in the global research framework. Journal of Altmetrics, 2(1), 1–18. https://doi.org/10.29024/joa.13
Ortega, J. L. (2020). Altmetrics data providers: A metaanalysis review of the coverage of metrics and publication. Profesional De La Información, 29(1), e290107. https://doi.org/10.3145/epi.2020.ene.07
Priem, J., & Hemminger, B. H. (2010). Scientometrics 2.0: New metrics of scholarly impact on the social Web. First Monday. https://doi.org/10.5210/fm.v15i7.2874
Sayyadiharikandeh, M., Varol, O., Yang, K. C., Flammini, A., & Menczer, F. (2020). Detection of novel social bots by ensembles of specialized classifiers. In Proceedings of the 29th ACM international conference on information & knowledge management (CIKM ’20) (pp. 2725–2732). ACM. https://doi.org/10.1145/3340531.3412698
Sugimoto, C. R., Work, S., Larivière, V., & Haustein, S. (2017). Scholarly use of social media and altmetrics: A review of the literature. Journal of the Association for Information Science and Technology, 68(9), 2037–2062. https://doi.org/10.1002/asi.23833
Tahamtan, I., & Bornmann, L. (2020). Altmetrics and societal impact measurements: Match or mismatch? A literature review. Profesional De La Información, 29(1), e290102. https://doi.org/10.3145/epi.2020.ene.02
Thelwall, M. (2015). Evaluating the comprehensiveness of Twitter Search API results: A four step method. Cybermetrics: International Journal of Scientometrics, Informetrics and Bibliometrics, 18, 1.
Thelwall, M., & Kousha, K. (2015a). Web indicators for research evaluation. Part 1: Citations and links to academic articles from the Web. Profesional De La Información, 24(5), 587–606. https://doi.org/10.3145/epi.2015.sep.08
Thelwall, M., & Kousha, K. (2015b). Web indicators for research evaluation. Part 2: Social media metrics. Profesional De La Información, 24(5), 607–620. https://doi.org/10.3145/epi.2015.sep.09
Thelwall, M. (2018). Altmetric prevalence in the social sciences, arts and humanities: Where are the online discussions? Journal of Altmetrics, 1(1), 1–12. https://doi.org/10.29024/joa.6
Warren, H. R., Raison, N., & Dasgupta, P. (2017). The Rise of Altmetrics. JAMA, 317(2), 131–132. https://doi.org/10.1001/jama.2016.18346
Yu, H., Xiao, T., Xu, S., & Wang, Y. (2019). Who posts scientific tweets? An investigation into the productivity, locations, and identities of scientific tweeters. Journal of Informetrics, 13(3), 841–855. https://doi.org/10.1016/j.joi.2019.08.001
Zahedi, Z., Costas, R., & Wouters, P. (2014). How well developed are altmetrics? A cross-disciplinary analysis of the presence of ‘alternative metrics’ in scientific publications. Scientometrics, 101(2), 1491–1513. https://doi.org/10.1007/s11192-014-1264-0
Zahedi, Z., Fenner, M., & Costas, R. (2015). Consistency among altmetrics data provider/aggregators: What are the challenges?. In: Altmetrics15: 5 years in, what do we know? (pp. 1–3). Amsterdam, The Netherlands. http://altmetrics.org/wp-content/uploads/2015/09/altmetrics15_paper_14.pdf
Acknowledgements
This paper has been supported by a Margarita Salas grant from the Ministerio de Universidades (Spain), funded by the European Union-Next Generation EU, and by the project UNIVERSEO (GV/2021/141), funded by the Generalitat Valenciana (Spain).
Funding
Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Orduña-Malea, E., Font-Julián, C.I. Are patents linked on Twitter? A case study of Google patents. Scientometrics 127, 6339–6362 (2022). https://doi.org/10.1007/s11192-022-04519-y
DOI: https://doi.org/10.1007/s11192-022-04519-y