1 Introduction

Google Trends (https://trends.google.com/) is a Google product that examines the popularity of Google search queries in various countries and languages. It is a popular data source that has been used in hundreds of studies, specifically 657 between 2006 and 2016 (Footnote 1), across various fields, including information systems and computer science, healthcare, economics, business, finance, and political science [5]. However, this data source also has its limitations. Franzén [4] highlighted concerns regarding the replicability and reliability of Google Trends based on several experiments. One limitation is that the search results are derived from samples rather than the entire population. Some studies have acknowledged this sampling error issue [2, 3, 15] and proposed solutions [3]. Another problem is that Google regularly updates its data collection methods, which may introduce measurement errors. To the author's knowledge, Lazer et al. [6] is the earliest study to have discussed this issue. However, since the details of the updated data collection methods are not publicly available, the analysis in Lazer et al. [6] may be speculative to some extent. Additionally, the findings of Lazer et al. [6] were based on an analysis conducted a decade ago. Further research is necessary to ascertain whether this issue persists.

Most recently, Google Trends updated its data collection method on 1 January 2022 (refer to the ‘Note’ in Fig. 1 below; Footnote 2). This update aimed to enhance its algorithms for more accurate identification of user queries. Myburgh [11] observed that values extracted after the implementation of this change were higher than those extracted before it. However, it was acknowledged that the timeframe of that analysis was relatively short and confined to a single field, underscoring the need for further research. Several studies have used Google Trends data retrieved prior to 2022, enabling a comparison of search results before and after that date. This study presents straightforward evidence demonstrating the presence of real and potentially significant measurement errors in Google Trends data. These errors have the potential to alter the significance of coefficients, potentially resulting in erroneous conclusions regarding international relations. Furthermore, they may adversely affect forecasting accuracy in fields such as business, economics, and healthcare, ultimately leading to incorrect decisions and incurring substantial costs in terms of both monetary resources and human lives. All the data from this study can be accessed through Mendeley via the following link: https://doi.org/10.1007/s44248-024-00013-3.

Fig. 1 Monthly worldwide search results for ‘data’ on Google Trends conducted on 14 June 2023

The structure of this paper is as follows: Sect. 2 provides a brief introduction to Google Trends and its nature, aiming to establish a theoretical foundation for validating the contents of Google Trends results. Section 3 discusses several search cases that highlight the measurement errors in Google Trends and their potential causes. Section 4 presents a concise analysis of the implications of this issue in the application of Google Trends. Finally, Sect. 5 concludes the paper.

2 Google Trends

This section introduces Google Trends and its nature, drawing from several previous studies [7, 8].

Google Trends analyzes the popularity of Google search terms across countries and languages. Its attributes include aggregation, topic categorization, and anonymity. Google Trends provides current data for the last seven days, daily data for the previous eight months, weekly data for the previous five years, and monthly data for longer periods. It collects samples of searches conducted by users.

Additionally, Google Trends displays search results on a scale of 0–100. Each point on the graph is divided by the highest point (100) and normalized to the time and location of the query. Consequently, regions with similar levels of interest in a particular phrase may exhibit significantly different total search volumes. By normalizing the data, it becomes possible to compare queries across time periods and geographical areas. Essentially, the data reflect search interest relative to the highest point on the chart for the specified location and time period. A score of 100 indicates that the term is at its peak popularity, a score of 50 indicates that it is half as popular, and a score of 0 means there is insufficient data to determine the term's popularity.
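The scaling described above can be illustrated with a minimal sketch. It assumes that each data point is first expressed as the term's share of all searches in the chosen region and period, and that the largest share is then rescaled to 100. The function name `to_trends_scale` and all counts are hypothetical; Google's actual pipeline is not public.

```python
def to_trends_scale(term_counts, total_counts):
    """Rescale search shares so that the largest share becomes 100.

    Assumption: share = term searches / all searches in the same period.
    """
    shares = [t / n if n > 0 else 0.0 for t, n in zip(term_counts, total_counts)]
    peak = max(shares, default=0.0)
    if peak == 0:
        return [0] * len(shares)
    return [round(100 * s / peak) for s in shares]


# Hypothetical counts: two regions with the same relative interest
# but very different total search volumes.
small_region = to_trends_scale([20, 40, 10], [10_000, 10_000, 10_000])
large_region = to_trends_scale([2_000, 4_000, 1_000],
                               [1_000_000, 1_000_000, 1_000_000])

print(small_region)
print(large_region)
```

Under these assumptions both regions yield the same index series, which is why the index alone says nothing about absolute search volume.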

From a communications perspective, Scharkow and Vogelgesang [14] interpreted Google Trends as a measure of the public agenda, representing the subjects that the public considers most important. However, they did not examine the informational content of Google Trends inquiries. Building on empirical evidence, Maurer and Holbach [9] further established a strong connection between media coverage and Google Trends. Ripberger [13] also validated this association using selected policy topics covered by the New York Times as examples. Oehl, Schaffer, and Bernauer [12] discovered, based on empirical evidence, a significant alignment between Google Trends and empirical metrics of media salience and politicization. In essence, Google Trends can be interpreted as a measurement of the aggregated effects of agenda-setting, encompassing policy issues, media coverage, and individuals' attention to specific topics. This definition will aid in validating the contents of Google Trends data.

3 Measurement errors

In this section, three cases are presented as examples to illustrate the measurement errors of Google Trends. The search phrases used are ‘covid-19,’ ‘decoupling China,’ and ‘debt trap diplomacy.’ These terms are broadly associated with public health, security considerations, and economic repercussions. Additionally, since this study includes replication studies (as outlined in Sect. 4), the accessibility of data is a pertinent factor. It is essential to clarify that this study does not claim a universal problem with Google Trends. To gain deeper insights into this issue, further investigations into the use of Google Trends in diverse domains are warranted.

Furthermore, the concerns raised in this study were communicated to Google on 7 June 2023. However, as of 12 May 2024, no response had been received.

3.1 ‘Covid-19’

COVID-19 (coronavirus disease 2019) was first identified in an outbreak in the Chinese city of Wuhan in December 2019. Initially, it was referred to as ‘coronavirus,’ ‘Wuhan coronavirus,’ or ‘Wuhan pneumonia.’ On 11 February 2020, the World Health Organization (WHO) officially named it ‘COVID-19.’ Fig. 2 below depicts the results of a search conducted on 14 June 2023 using the key phrase ‘covid-19’ (note that search results are not case-sensitive).

Fig. 2 Search results for ‘covid-19’ conducted on 14 June 2023, Google Trends monthly data from January 2013 to November 2019, worldwide

Firstly, it is surprising to observe numerous search results for ‘covid-19’ before December 2019. Since the virus emerged in December 2019 and acquired its official name ‘COVID-19’ in February 2020, it is implausible that people searched for this term between 2013 and 2019. It could be argued that these results stem from sampling errors. Eichenauer et al. [3] proposed that employing a substantial number of samples, up to 12, may effectively mitigate sampling errors. However, irrespective of the number of samples used, these results persist. Thus, these search results are not merely noise but rather indicative of measurement errors.
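The distinction between sampling noise and a persistent artefact can be sketched as follows. The sketch averages several draws of the same query and lists the periods whose value is above zero in every draw. The draws are hypothetical numbers, not the values behind Fig. 2, and the rule "non-zero in every draw" is an illustrative assumption rather than the procedure of Eichenauer et al. [3].

```python
from statistics import mean


def average_draws(draws):
    """Average several draws of the same query, period by period."""
    return [mean(values) for values in zip(*draws)]


def nonzero_in_every_draw(periods, draws):
    """Return the periods whose value is above zero in every draw."""
    return [p for p, values in zip(periods, zip(*draws)) if min(values) > 0]


# Hypothetical draws for three months before the term existed.
periods = ["2015-03", "2015-04", "2015-05"]
draws = [
    [12, 100, 9],
    [10, 96, 0],
    [14, 100, 11],
]

averaged = average_draws(draws)
persistent = nonzero_in_every_draw(periods, draws)

print(averaged)
print(persistent)
```

If the pre-coinage values were pure sampling noise, one would expect them to vanish in some draws and to shrink as more draws are averaged. Values that stay clearly above zero in every draw point to something other than sampling.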

Google Trends offers topic-based search functionality. Search terms yield matches for all terms within the query, based on the language provided. Topics encompass a cluster of terms sharing the same concept across languages. For instance, when searching “covid-19,” topics like “Coronavirus disease 2019 (Disease)” or “Symptoms of COVID-19 (Topic)” are automatically generated. These topics may cover a range of related terms such as other coronaviruses, early references to the Wuhan virus, or specific symptoms predating COVID-19. However, this distinction does not apply to this study, which solely relies on individual terms.

Secondly, the question arises: how are these search results obtained? As discussed in Sect. 2, the surge in search interest is assumed to be driven by factors such as policy announcements, media coverage, or individual attention to the issue (e.g., someone searching for ‘covid-19’ if they have been infected by the virus). It is assumed that there will be a time lag between these factors and the subsequent Google search results. Figure 2 reveals that the highest peak occurred in April 2015. To investigate what transpired during March–April 2015, a Google Advanced search was conducted, and the following link displays the search results for that period: https://www.google.ca/search?q=covid-19&lr=&safe=images&source=lnt&tbs=cdr%3A1%2Ccd_min%3A3%2F1%2F2015%2Ccd_max%3A4%2F30%2F2015&tbm= (accessed on 20 June 2023). However, these results show no content related to COVID-19 during March–April 2015. One plausible explanation is that some COVID-19-related content was later combined with content produced during March–April 2015, thus misleading the algorithm.

While the above analysis demonstrates the existence of measurement errors, it does not establish that this phenomenon is exclusive to the post-2022 period. The two cases in Sects. 3.2 and 3.3 examine this point further.

Another search conducted on 12 May 2024 (see Appendix 1) indicates that these previously observed “measurement errors” have been rectified. Following the communication of the aforementioned issue, including the case of “covid-19,” to Google in June 2023 (though feedback was not forthcoming), it appears plausible that Google Trends has indeed updated its system. Nonetheless, a distinct issue arises, as evidenced by other example terms.

3.2 ‘Decoupling China’

Liu [7] examined the US-China decoupling using Google Trends data retrieved before 2022. The relationship between the United States and China has been significant for governments, organizations, firms, and investors worldwide. In recent years, the competition between the US and China has intensified, leading both sides to explore ways to reduce their economic interdependence, known as decoupling. Figure 3 below illustrates the search results for ‘decoupling China’ (Footnote 3) conducted in August 2021.

Fig. 3 Search results for ‘decoupling China’ conducted on 1 August 2021, Google Trends weekly data from 3 July 2016 to 20 June 2021, in the US. Source: Liu [7]

Liu [7] conducted a comprehensive analysis of the surge in search interest. Figure 3 indicates that the search interest in ‘decoupling China’ reached its first peak around the week of 2 September 2018. Several fundamental reasons contributed to this surge. The tipping point was the US-China trade war, which began on 6 July 2018 when the Trump administration announced tariffs of 25 percent on $50 billion worth of Chinese exports, prompting China to retaliate with tariffs targeting US goods. This trade decoupling policy generated numerous analyses and reports on topics such as the economic costs of technological and commercial decoupling, the impact on global economic growth, and implications for geopolitical competition. The emergence of this new policy and the subsequent media and analyst coverage drove the surge in interest.

The highest peak in search interest occurred in late August and early September 2020. This surge also had fundamental reasons, primarily related to then-President Donald Trump’s remarks on decoupling from China, as well as previous discussions and analyses. The second-largest peak took place in mid-June 2020, also associated with Mr. Trump’s comments on US-China decoupling (for more details, refer to Liu [7]).

Figure 4 below demonstrates the search results for ‘decoupling China’ conducted in June 2023.

Fig. 4 Search results for ‘decoupling China’ conducted on 14 June 2023, Google Trends weekly data from 3 July 2016 to 20 June 2021, in the US

Figure 4 reveals a significant number of search results before September 2018. As previously discussed, effective US-China decoupling started in July 2018 when tariffs were imposed on goods imported from each side. Since then, discussions on US-China decoupling have intensified. It is implausible for people to have searched for ‘decoupling China’ before 2018. Once again, these pre-2018 results may be attributed to sampling errors. However, similar to the search results for ‘covid-19,’ these results persist regardless of the number of samples used, indicating measurement errors.

Additionally, Fig. 4 shows that search interest peaked in the week of 31 July 2016. A Google Advanced search (https://www.google.ca/search?q=decoupling+china&lr=&safe=images&tbs=cdr:1,cd_min:7/24/2016,cd_max:8/6/2016&ei=kj-RZMjFKsnMseMP9uyF4AI&start=0&sa=N&ved=2ahUKEwjI1rCCldH_AhVJZmwGHXZ2ASw4ChDy0wN6BAgIEAQ&biw=1280&bih=577&dpr=1.5. Accessed on 20 June 2023) reveals that there is minimal relevant content during that period, except for a few articles on energy economics. It is improbable for this technical topic alone to drive the search interest to such high levels. Furthermore, a webpage dated 26 July 2016 (https://www.eurasiagroup.net/issues/chinas-rise. Accessed on 20 June 2023) contains an article on financial decoupling from China. However, this article appeared on the webpage much later, potentially misleading the search algorithm, as observed in the case of ‘covid-19’ search results.
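A comparison of two retrievals of the same query and window, such as Figs. 3 and 4, can be summarised with a short sketch. The series below are hypothetical numbers chosen only to mimic the pattern described in the text; they are not the data behind the figures. The names `compare_vintages` and `earliest_plausible` are illustrative.

```python
def peak_period(series):
    """Return the period label with the highest value."""
    return max(series, key=series.get)


def compare_vintages(old, new, earliest_plausible):
    """Summarise how two retrievals of the same query and window differ.

    Periods are ISO date strings, so string comparison follows date order.
    """
    def early_nonzero(series):
        return sum(1 for p, v in series.items() if p < earliest_plausible and v > 0)

    return {
        "old_peak": peak_period(old),
        "new_peak": peak_period(new),
        "peak_moved": peak_period(old) != peak_period(new),
        "old_early_nonzero": early_nonzero(old),
        "new_early_nonzero": early_nonzero(new),
    }


# Hypothetical weekly values for an earlier and a later retrieval.
retrieved_2021 = {"2016-07-31": 0, "2018-09-02": 40, "2020-08-30": 100}
retrieved_2023 = {"2016-07-31": 100, "2018-09-02": 35, "2020-08-30": 80}

summary = compare_vintages(retrieved_2021, retrieved_2023, "2018-07-06")
print(summary)
```

A moved peak together with new non-zero values before the earliest plausible date is the pattern that the text treats as evidence of measurement error rather than sampling noise.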

Another search conducted on 12 May 2024 (see Appendix 2) reveals that the previously abundant results have now dwindled. As previously discussed, the topic of decoupling from China was a highly debated issue in the US from 2018 to 2021. According to Black and Morrison [1], the specific phrases ‘decouple from China’ or ‘decoupling from China’ appeared three times more frequently in articles during the first ten months of 2020 than in the preceding three years combined. It seems improbable that search results would be this scarce. While it is conceivable that Google Trends has updated its system following the author’s report in June 2023, the issue has shifted from excess to insufficiency.

3.3 ‘Debt trap diplomacy’

Liu [8] examined China's debt trap diplomacy using data retrieved from Google Trends before 2022. Debt trap diplomacy refers to an international financial relationship in which a creditor country, particularly China, extends debt to a borrowing country to increase the lender's political leverage. Figure 5 below displays the search results for ‘debt trap diplomacy’ conducted in November 2021.

Fig. 5 Search results for ‘debt trap diplomacy’ conducted on 7 November 2021, Google Trends monthly data from January 2013 to October 2021, worldwide. Source: Liu [8]

Figure 5 demonstrates that the surge in interest in ‘debt trap diplomacy’ emerged in August 2017. According to Liu [8], this surge has fundamental reasons. Specifically, the signing of a 99-year lease agreement between a Chinese firm and the Sri Lanka Ports Authority to develop Hambantota Port on 29 July 2017 immediately drew the attention of scholars and analysts commenting on China’s debt trap diplomacy and its implications. This timing aligns with the geopolitical dynamics. For instance, the term ‘debt trap diplomacy’ was coined by an Indian scholar in 2017, and discussions on this topic gained momentum in 2018 (for more details, refer to Liu [8]).

Additionally, Fig. 5 shows that the largest peak occurred in May 2020. This surge also has its fundamental reasons. According to Liu [8], the increase in search interest is primarily related to the COVID-19 pandemic and its impact on debt-laden Belt and Road projects (Footnote 4).

Figure 6 below presents the search results for ‘debt trap diplomacy’ conducted in June 2023.

Fig. 6 Search results for ‘debt trap diplomacy’ conducted on 14 June 2023, Google Trends monthly data from 3 July 2016 to 20 June 2021, worldwide

Figure 6 reveals a significant number of search results before 2017. As discussed previously, this phrase was only coined in 2017, making it impossible for people to search for it before that. Once again, averaging sample data cannot eliminate these results. They are simply measurement errors. Additionally, there are inconsistencies in the peak data.

Furthermore, Fig. 6 shows that the largest peak in search interest before 2017 occurred in July 2014. A Google Advanced search (https://www.google.ca/search?q=debt+trap+diplomacy&lr=&safe=images&biw=1280&bih=577&source=lnt&tbs=cdr%3A1%2Ccd_min%3A6%2F1%2F2014%2Ccd_max%3A7%2F31%2F2014&tbm=, accessed on 20 June 2023) indicates that there is no content related to debt trap diplomacy during that month or the preceding month. While this absence of content makes sense, the surge in search interest remains unexplained.

Another search conducted on 12 May 2024 (see Appendix 3) demonstrates a consistent shift from excess to insufficiency in results. As previously noted, the discourse around debt trap diplomacy was significant during 2018–2021. Such minimal search interest does not seem reflective of the actual discourse.

4 Implications

Section 3 shows that Google Trends data contain measurement errors resulting from changes in the data collection method. This has important implications for further applications of this popular data source.

First, in international relations, applications of Google Trends need to be approached with caution. Several studies cited here, which are the first of their kind to apply Google Trends data-based time series models to international relations, treated sampling errors and measurement errors as largely irrelevant. The main reason is that, unlike business, economics, and healthcare, which focus on forecasting, scholars in international relations are primarily concerned with the sign rather than the magnitude of coefficients. However, when using Google Trends data retrieved in 2023, this study finds that the results obtained in Liu [7, 8] cannot be fully replicated. While the performance of some variables remains the same, others have changed in terms of their significance level and sign. Considering the evident measurement errors discussed in Sect. 3, this change in results may not be surprising. The results obtained in May 2024, as depicted in Appendices 2 and 3, contain too few observations to support meaningful causal analysis.

In response to the emergence of measurement errors, a few steps need to be taken before applying Google Trends data to infer causal relations. First, if possible, it is important to check the start date of the data based on theoretical grounds. This step will help determine whether measurement errors are present. Second, checks on peak data should be conducted as thoroughly as possible, using tools such as Google Advanced Search, to identify the fundamental factors behind the surge of search interest. Third, and most importantly, any causal relations drawn based on Google Trends data need to be supported by theoretical foundations.
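The first two checks can be sketched as a small helper. It flags non-zero values before a theoretically justified start date, lists the largest peaks, and builds a date-restricted search URL covering each peak month and the month before it. The `tbs=cdr` date filter is copied from the search URLs quoted in Sect. 3; it is not a documented interface and may change. The series, the function names, and the cut-off date are hypothetical.

```python
import calendar
from urllib.parse import urlencode


def values_before(series, earliest):
    """Periods (year, month) with a non-zero value before the earliest plausible date."""
    return sorted(p for p, v in series.items() if p < earliest and v > 0)


def top_peaks(series, k=3):
    """The k periods with the highest values, largest first."""
    return sorted(series, key=series.get, reverse=True)[:k]


def peak_check_url(term, year, month):
    """Search URL restricted to the peak month and the preceding month."""
    prev_year, prev_month = (year - 1, 12) if month == 1 else (year, month - 1)
    last_day = calendar.monthrange(year, month)[1]
    date_filter = (
        f"cdr:1,cd_min:{prev_month}/1/{prev_year},cd_max:{month}/{last_day}/{year}"
    )
    return "https://www.google.ca/search?" + urlencode({"q": term, "tbs": date_filter})


# Hypothetical monthly series keyed by (year, month).
series = {(2014, 7): 100, (2016, 1): 0, (2017, 8): 60, (2020, 5): 85}

suspicious = values_before(series, earliest=(2017, 1))
peaks = top_peaks(series, k=2)
url = peak_check_url("covid-19", 2015, 4)

print(suspicious)
print(peaks)
print(url)
```

The third step, grounding any causal claim in theory, cannot be automated; the helper only narrows down which periods deserve a manual look.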

Additionally, it is likely that if the search term is popular enough, the measurement errors may be less significant. The rationale behind this is that, as discussed in Sect. 2, Google Trends search results are normalized data on a scale of 0–100. Therefore, if the peak data point is large enough, the measurement errors may be relatively small. Figure 7 below shows the search results for ‘covid-19’ since 2013.

Fig. 7 Search results for ‘covid-19’ conducted on 14 June 2023, Google Trends monthly data from January 2013 to May 2023, worldwide

Figure 7 illustrates that search results for COVID-19 before January 2020 have become very small. However, many of them are not zero yet and are simply labeled as ‘<1’.
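The dilution effect can be made concrete with a small arithmetic sketch. It assumes that a spurious signal of fixed size is rescaled against the peak of the chosen window, and that values strictly between 0 and 1 are displayed as ‘<1’. The function `displayed_value`, the rounding rule, and the numbers are hypothetical.

```python
def displayed_value(share, peak_share):
    """Label for a data point after rescaling against the window's peak."""
    scaled = 100 * share / peak_share
    if scaled == 0:
        return "0"
    if scaled < 1:
        return "<1"  # assumed display rule for small non-zero values
    return str(round(scaled))


# The same spurious signal of 2 units, against a small and a large peak.
against_small_peak = displayed_value(2, 40)
against_large_peak = displayed_value(2, 4000)

print(against_small_peak)
print(against_large_peak)
```

Under these assumptions the spurious signal does not disappear when the window includes a very large peak; it is only compressed to a value that looks negligible on the 0–100 scale.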

In business, economics, and healthcare, Google Trends is applied mainly to forecasting. In these fields, sampling errors can be effectively eliminated [2], but measurement errors remain a source of forecasting inaccuracy. The steps outlined above apply equally to these domains. Most importantly, authors who have used Google Trends data retrieved before 2022 are encouraged to replicate their studies using data sets obtained at a later time. If these studies have already accounted for sampling errors, any new forecasting errors can then be attributed to measurement errors.

5 Concluding remarks

Google Trends is a popular data source that has been applied in hundreds of studies across various fields, including information technology, business, economics, healthcare, and political science. While it offers several advantages such as free access and easy availability, it also has its limitations. Previous studies have addressed the issue of sampling errors associated with Google Trends. However, this study focuses on the measurement errors resulting from changes in its data collection method and presents clear evidence demonstrating the existence of such measurement errors. These errors have the potential to yield inaccurate conclusions in the realm of international relations and can lead to flawed forecasting in fields like business, economics, and healthcare.

Specifically, by examining key phrases such as ‘covid-19,’ ‘decoupling China,’ and ‘debt trap diplomacy,’ this study reveals that Google Trends can generate many search results during periods when these terms were not yet coined. By comparing these results with those obtained before 2022, this study demonstrates that these ‘unusual’ search results only emerged after 2022, when Google updated its method of collecting search data. These findings clearly indicate the presence of measurement errors in Google Trends data. Therefore, scholars are advised to take additional steps to validate the content of Google Trends data. This may include conducting checks on the data's initiation date and scrutinizing peak data. After communicating the mentioned issue to Google (with no feedback received as of 12 May 2024), the search results have shifted from being “too many” to “too few.” However, these “too few” results do not align with reality.

Authors in the fields of business, economics, and healthcare are encouraged to replicate their studies conducted before 2022 using new data sets retrieved after 2022 to assess the potential occurrence of significant forecasting errors. Moreover, certain Google Trends results have been validated through benchmarking against survey data [10]. For these terms verified prior to 2022, future studies can delve deeper into their validation by utilizing data obtained after 2022.

A limitation of this study lies in its focus on specific, limited examples, rather than claiming a universal issue of measurement errors with Google Trends. The study’s primary aim is to raise awareness about this matter, with the intention of prompting more investigations in the future.