Abstract
Objectives
This data note introduces the TFPsocialmedia dataset, designed to aid social media researchers investigating Turkish Foreign Policy (TFP). With over 180 thousand tweets from 3597 accounts across 27 countries, the dataset encompasses intricate social and communication networks. It facilitates analysis of political actors, public perceptions of foreign policy, and policy-related events around TFP. This study illuminates Turkey’s representation within the Twitter community and highlights the efficacy of Twitter data in foreign policy analysis.
Data description
The TFPsocialmedia dataset offers comprehensive social media data for understanding Turkish Foreign Policy. Curated tweets capture nuances of political discourse, while the social network structure enables research on public perceptions and foreign policy events. Researchers can use the dataset to examine online portrayals of Turkey and to explore the use of Twitter data in foreign policy analysis. This resource is particularly valuable for scholars studying the digital dimensions of international relations.
1 Objective
1.1 Background and data rationale
Social media has emerged as a valuable resource for dissecting foreign policy dynamics, offering insights into public opinion, diplomatic interactions, and the communication of policy decisions [1,2,3]. Among social media platforms (SMPs), Twitter stands out due to its concise nature and rapid information dissemination, making it a prime source for real-time foreign policy insights [4, 5]. Recognizing this, we introduce the TFPsocialmedia dataset, tailored for scholars interested in Turkish Foreign Policy and social media analysis for international politics.
1.2 Dataset overview
TFPsocialmedia encompasses 180,302 tweets from 3597 accounts across 27 countries, focusing on political actors and commentators. It enables analysis of intricate communication networks and public perceptions surrounding TFP. Our dataset, spanning 2007 to 2023, supports statistical, network, and text analysis techniques. It unveils Turkey’s image within the global Twittersphere, offering multifaceted insights into its portrayal and perception.
1.3 Research applications
The TFPsocialmedia dataset has been employed in the study by Mehmetcik et al. [6], which analyzes perceptions of Turkey in the US Congress using Twitter data. Beyond this, the dataset allows exploration of political discourse patterns, sentiment trends, and crisis response dynamics. By connecting Twitter trends with foreign policy events, researchers can examine how online discourse relates to policy developments.
1.4 Future directions
Our resource-rich dataset bolsters the study of TFP and extends to broader applications in international relations research. Its insights into communication networks and public perception dynamics offer valuable context for policy analysis. The TFPsocialmedia dataset’s ongoing updates and diverse account selection promise continued relevance and depth for researchers exploring multifaceted aspects of TFP.
In summary, the TFPsocialmedia dataset, illustrated by its application in a notable study, facilitates nuanced exploration of TFP perceptions and communication dynamics. As an adaptable and expanding resource, it holds potential to enrich foreign policy analysis within broader regional and global contexts.
1.5 Data description
The TFPsocialmedia dataset comprises tweets related to the keyword “Turkey,” sourced from a meticulously curated list of 3597 user accounts on Twitter. These accounts encompass individuals, organizations, and news outlets relevant to Turkey-related subjects. To ensure data relevance, the academic Twitter API was employed, utilizing the stream-by-user account option coupled with a keyword filter.
Data Collection Process:
- User account selection: A comprehensive list of relevant Twitter user accounts was compiled, representing key stakeholders discussing Turkish Foreign Policy.
- Stream-by-user account option: Leveraging the academic API, tweets from the selected user accounts were streamed in real time, ensuring a continuous feed of their content.
- Keyword filter: To refine the dataset’s focus, tweets containing the keyword “Turkey” were filtered, ensuring alignment with the research topic.
- Data collection: The matched tweets were collected and stored, resulting in a raw collection of over 220,000 tweets. Data collection commenced in July 2022 and has been ongoing since then, continuously capturing new data.
- Data cleaning: A rigorous data cleaning process eliminated irrelevant data. For instance, a text processing and filtering script separated out unrelated “turkey” content, leaving only tweets relevant to Turkey the country. Because the data is continually updated, the dataset requires continuous monitoring and refinement.
- Final dataset: The cleaned dataset encompasses more than 180,000 tweets, including attributes such as tweet text, date, and user information.
- Updates: Our working process involves collecting tweets weekly, using specific time scale options. In practical terms, every new search is initiated from the point where the previous one concluded. This approach ensures systematic and continuous data collection, data cleaning, and further analysis over time.
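The cleaning and update steps above can be illustrated with a minimal sketch. The authors' actual filtering script is not described in this note, so the cue lists, the function names `is_about_country` and `next_start_time`, and the `text` and `created_at` field names below are all hypothetical assumptions chosen for illustration only.

```python
import re
from datetime import datetime, timezone

# Hypothetical cue lists (assumptions); the real filtering rules are not given in this note.
KEYWORD = re.compile(r"\bturkey\b", re.IGNORECASE)
FOOD_CUES = re.compile(
    r"\b(thanksgiving|roast(ed)?|stuffing|gravy|sandwich|recipe|oven)\b",
    re.IGNORECASE,
)
COUNTRY_CUES = re.compile(
    r"\b(ankara|istanbul|erdogan|erdoğan|turkish|nato|syria|greece)\b",
    re.IGNORECASE,
)


def is_about_country(text: str) -> bool:
    """Keep tweets that mention 'Turkey' and do not look like food or holiday talk."""
    if not KEYWORD.search(text):
        return False  # keyword filter: the tweet must mention "Turkey"
    if COUNTRY_CUES.search(text):
        return True  # an explicit country cue overrides any food cue
    return not FOOD_CUES.search(text)


def next_start_time(collected: list[dict]) -> datetime:
    """Return the newest timestamp seen so far, so the next weekly search can resume from it."""
    return max(tweet["created_at"] for tweet in collected)


# Toy records with assumed field names ("text", "created_at").
tweets = [
    {"text": "Turkey and Greece resume talks in Ankara",
     "created_at": datetime(2023, 1, 5, tzinfo=timezone.utc)},
    {"text": "Best roasted turkey recipe for Thanksgiving",
     "created_at": datetime(2023, 1, 6, tzinfo=timezone.utc)},
    {"text": "Budget debate continues in parliament",
     "created_at": datetime(2023, 1, 7, tzinfo=timezone.utc)},
]

kept = [t for t in tweets if is_about_country(t["text"])]
resume_from = next_start_time(tweets)
```

A rule-based filter like this is easy to audit but will misclassify edge cases, which is one reason a continually updated collection would need the ongoing monitoring mentioned above.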
1.6 Data accessibility and transparency
The dataset, together with the code and calculations, is publicly accessible both in a Figshare repository [7] and on a dedicated web page (see Notes). This transparent approach adheres to reproducible social data science practices. Sharing these resources fosters openness and collaboration within the research community, facilitating validation and extension of the work. The data’s availability promotes robustness and quality in social data science research (Table 1).
2 Limitations
- Language bias: The dataset primarily features English-language tweets, potentially introducing a language bias that could limit its representativeness across non-English-speaking countries. While English serves as a prevalent international language, this limitation might hinder a complete global landscape depiction.
- Account dominance: The dataset displays a prominent presence of United States (861 accounts), European institutions (778 accounts), and the United Kingdom (429 accounts). This dominance could result from the English-language focus and the political and economic significance of these entities, potentially skewing the representation of other regions.
- Selection bias: The dataset’s account selection process leans toward politically relevant accounts such as politicians, state officials, and political commentators. This selection bias could underrepresent alternative voices and grassroots perspectives, influencing the overall balance of viewpoints presented.
- Twitter’s structure: Due to the dynamic nature of Twitter and the potential deletion of older tweets over time, our dataset may have better coverage for the periods in which it was collected compared to earlier periods.
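To make the account dominance concrete, the short calculation below converts the account counts reported above into shares of the 3597 accounts in the dataset. It uses only figures stated in this note; the variable names are illustrative.

```python
# Account counts as reported in this data note.
TOTAL_ACCOUNTS = 3597
counts = {
    "United States": 861,
    "European institutions": 778,
    "United Kingdom": 429,
}

# Percentage share of all accounts for each group, rounded to one decimal place.
shares = {name: round(100 * n / TOTAL_ACCOUNTS, 1) for name, n in counts.items()}

# Combined share of the three groups.
combined_share = round(100 * sum(counts.values()) / TOTAL_ACCOUNTS, 1)

for name, pct in shares.items():
    print(f"{name}: {pct}%")
print(f"Combined: {combined_share}%")
```

Under these figures the three groups account for roughly 23.9%, 21.6%, and 11.9% of accounts respectively, or about 57.5% combined, so findings drawn from the full dataset will weight these perspectives heavily.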
2.1 Mitigation efforts
- To address language bias, ongoing efforts involve incorporating non-English tweets using translation tools. This expansion aims to enhance global representation and increase the dataset’s inclusivity.
- While acknowledging selection bias, the dataset’s focus on specific account categories aims to ensure data consistency and reliability, mitigating misinformation. Interpretation of findings should consider the potential influence of selection bias.
As part of our commitment to transparency, we actively acknowledge and address these limitations, striving to improve the dataset’s comprehensiveness and relevance for diverse research inquiries.
Data availability
The data described in this Data Note can be freely and openly accessed on figshare under https://doi.org/10.6084/m9.figshare.24050058. Please see Table 1 and reference [7] for details and links to the data.
Code availability
The codes described in this Data Note can be freely and openly accessed on figshare under https://doi.org/10.6084/m9.figshare.24050058.
Notes
We also provide a dedicated web page hosting a comprehensive dataset on Turkish Foreign Policy, collected and analyzed for further research purposes, at https://smrlweb.onrender.com. As part of our commitment to fostering collaboration and knowledge exchange, we invite fellow researchers and scholars to access and use this dataset for their own studies.
Abbreviations
- TFP: Turkish Foreign Policy
- USA: United States of America
- SMPs: Social Media Platforms
References
1. Zeitzoff T, Kelly J, Lotan G. Using social media to measure foreign policy dynamics: an empirical analysis of the Iranian–Israeli confrontation (2012–13). J Peace Res. 2015;52(3):368–83. https://doi.org/10.1177/0022343314558700.
2. Baum MA, Potter PBK. Media, public opinion, and foreign policy in the age of social media. J Polit. 2019;81(2):747–56. https://doi.org/10.1086/702233.
3. Manor I, Segev E. Social media mobility: leveraging Twitter networks in online diplomacy. Global Pol. 2020;11(2):233–44. https://doi.org/10.1111/1758-5899.12799.
4. Schmitt L. What’s in a tweet? Twitter’s impact on public opinion and EU foreign affairs. docCIDOB. 2021. https://doi.org/10.24241/docCIDOB.2021.11.
5. Dersan Orhan D. Making foreign policy through Twitter: an analysis of Trump’s tweets on Iran. In: Esiyok E, editor. Advances in multimedia and interactive technologies. USA: IGI Global; 2021. p. 380–94. https://doi.org/10.4018/978-1-7998-3201-0.ch022.
6. Mehmetcik H, Koluk M, Yüksel G. Perceptions of Turkey in the US Congress: a Twitter data analysis. Uidergisi. 2023;19(76):69. https://doi.org/10.33458/uidergisi.1226450.
7. Mehmetcik H. TFPsocialmedia dataset. figshare. 2023. https://doi.org/10.6084/m9.figshare.24050058.
Acknowledgements
The funding organization is The Scientific and Technological Research Council of Türkiye (TÜBİTAK).
Funding
This work was supported in part by a TÜBİTAK Grant (121K890).
Contributions
All authors contributed significantly to various aspects of this study. The study’s conception and design were a collaborative effort involving Hakan Mehmetcik, Murat Can Ganiz, Melih Koluk, Muslim Yilmaz, Muhammed Mustafa Ince, Galip Yuksel, and Emre Tortumlu. Hakan Mehmetcik and Murat Can Ganiz were responsible for material preparation, data collection, and analysis, with Hakan Mehmetcik taking the lead in drafting the initial manuscript. Murat Can Ganiz provided substantial input through commentary on previous versions of the manuscript. Throughout the study’s progression, all authors engaged in thorough discussions, critically assessed the content, and offered their insights. The final manuscript was reviewed and approved by all authors, demonstrating their collective agreement on its content and findings.
Ethics declarations
Competing interests
Not applicable.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Mehmetcik, H., Ganiz, M.C., Koluk, M. et al. TFPsocialmedia: a public dataset for studying Turkish foreign policy. Discov Data 2, 3 (2024). https://doi.org/10.1007/s44248-024-00009-z
DOI: https://doi.org/10.1007/s44248-024-00009-z