1 Introduction

The fediverse is a federated network, i.e., a network of servers that operate in a decentralized manner. Its communication protocol is ActivityPub, which is standardised by the World Wide Web Consortium (W3C) (Lemmer-Webber et al. 2018). We believe that decentralized networks are more ethical and privacy-respecting alternatives to centralized Big Tech networks. Monitoring, profiling, and privacy issues may arise in networks centrally controlled by companies for the purpose of recommending relevant content to their users (Duskin et al. 2024; Dujeancourt and Garz 2023), and also because users have no way to own their data. Decentralized networks follow a different model and are seen as more ethical because they give users the opportunity to own and control their data: users can run their own instances and post on their own servers and, as far as we know, their operational models involve no data collection or sharing with third parties, although issues with metadata have been identified (Greschbach et al. 2012). Thus, they are alternatives that currently respect freedom from interference, as Coeckelbergh (2022) discusses for AI in general. Existing work regarding Mastodon focuses on adoption rates (La Cava et al. 2021), as well as trending topics analysis (Al-khateeb 2022b). In this work, we focus on adoption rates based on instance types and topics. We study the adoption rates statistically by analysing account registration, instance creation, and account activity. Furthermore, following promising results regarding country-specific instances from Sabo et al. (2024), we expand our initial paper with comparisons and analysis of country-specific instances.

Mastodon uses the aforementioned ActivityPub protocol, which provides a client-to-server API for content management and a server-to-server API. Servers involved in Mastodon are called instances. Each user has an inbox for incoming messages and an outbox for outgoing messages. Within the client-to-server architecture, notifications are published to the sender’s outbox, and to view them, the actors must request access. In federated server settings, notifications are dispatched directly to the intended recipient’s inbox, and only this setup offers subscription functionalities. Moreover, the sender needs a followers list, and non-federated actors must know all senders. This fosters a unique inter-server communication environment where data is stored and shared selectively based on actor type and server federation (Ilik and Koster 2019).

Mastodon’s software is free and open-source. A user can set up a Mastodon instance or register on one of the instances that offer account registration, and accounts on different instances can communicate with each other through a federated timeline. Three types of feeds can be accessed: (a) the ‘Home’ feed, where the user can see posts from people they follow, (b) the ‘Local’ feed, where the user can see posts from all people on the specific instance they belong to, and (c) the ‘Federated’ feed, where the user can see posts from all other instances of Mastodon (La Cava et al. 2021). Additionally, there is an ‘Explore’ feed where the user can view the most popular posts by users in various instances. While most Mastodon instances are public, there are also private ones, where the owner of the instance can admit people to the server at their discretion. Mastodon provides functionalities such as posting, replies, favourites, bookmarks and hashtags. Mastodon saw a spike in popularity around November 2022, after the acquisition of Twitter, a change also reported in Stokel-Walker (2022). The number of unique users on Mastodon almost doubled after the Twitter acquisition, from just under 5 million in early November 2022 to 8.7 million in March 2023 (Socialhome 2023). We also observe this growth for some instances in this work. Our motivation is to study the adoption of these decentralized alternatives and their growth rate, so we pose our main research question:

  • What insights about longitudinal dynamics with regards to account creation growth rates can we identify on Mastodon?

Related to our main research question, we focus in particular on the following sub-questions:

  (a) What can we infer regarding the adoption rate of Mastodon users within popular instances, when taking into account registration on instances as well as posts?

  (b) What insights can we get when studying the adoption rate considering instances dedicated (largely) to specific topics?

  (c) What insight do the centrality metrics give us regarding influential accounts on Mastodon?

  (d) What initial insight can we get regarding group and community structure on Mastodon?

Motivated by the opportunity to investigate how a decentralized social networking system evolves over time, in this work we analyze the adoption rate of users based on types of instances, location, and topics of discussion. Following the examination of user adoption on large and popular instances, we further investigate it from a country-specific point of view, where we also compare two nations with each other. We report a longitudinal analysis of several Mastodon instances and our insights into the structure of the largest Mastodon instance. In addition, we report an adoption rate analysis focused on topics such as academia, politics, and general activism. The structure of the paper is as follows. In Sect. 2 we discuss related work, and in Sect. 3 we elaborate on our dataset and methodology. Section 4 presents our longitudinal analysis and results based on account registration, including centrality metrics and community-analysis algorithms, and Sect. 5 shows the results of a study of adoption that includes activity. Section 6 provides a country-based analysis and results focused on topic-specific instances, whereas Sect. 7 discusses some limitations of our collected data. We conclude the paper and discuss future work in Sect. 8.

2 Related work

Research on Mastodon instance dynamics has been scarce, since Mastodon is a fairly new network that has existed only since 2016, but some recent work exists. We discuss the relevant work next.

2.1 Network analysis of Mastodon

Collecting and analyzing data from academic servers adds to the data and analysis of an existing project that aims to provide insight into the adoption rate of alternatives such as Mastodon.

Zignani et al. conducted one of the first large-scale studies on Mastodon users in 2018 (Zignani et al. 2018). The research analyzed the Mastodon dataset available at the time and studied network growth and relationships between instances. They found differences from mainstream social media platforms, as users tend to follow other users and instances based on interests rather than popularity. In another paper, the same authors study the impact of decentralized architecture on social relationships within Mastodon (Zignani et al. 2019). On the user level, they find that users are influenced more by geography or culture than by the architecture, and that they explore few instances. On the instance level, they find that the architecture of each server affects the clustering of users’ ego-networks and that each instance has a unique footprint (Zignani et al. 2019). In 2021, more extensive research was conducted by La Cava et al. (2021). They built upon Zignani’s datasets and analyzed data from that time, as Mastodon had grown fivefold since the last major study on large datasets. This study reinforces Zignani’s findings and provides further insight into connections between users from multiple instances within the fediverse. La Cava et al. studied the network of Mastodon instances on a macroscopic and mesoscopic level and analyzed how these instances evolve. The study concluded that Mastodon has achieved "structural stability and a solid federative mechanism among instances" (La Cava et al. 2021). In 2022, La Cava et al. followed up their research by studying the relations and roles of users in Decentralized Online Social Networks (DOSNs) (La Cava et al. 2022). They state that links between users on Mastodon are interest-based and not artificially stimulated, resulting in a decoupled network. The researchers found that there are two main roles users can have on the platform: bridge or lurker. A bridge user is someone who is active in more than one instance and acts as a bridge between these instances. A lurker is someone who rarely contributes on a Mastodon instance but is still considered an active user, as they stay online on the platform and "consume information". The user migration that followed the Twitter acquisition is analyzed in Drivers of social influence in the Twitter migration to Mastodon (La Cava et al. 2023), which details how migrating users influence other users as well.

Another view is described in Challenges in the Decentralised Web: The Mastodon Case (Raman et al. 2019), which focuses more on the difficulties of decentralization. The aforementioned works report positive conclusions regarding Mastodon, such as its ability to enable community autonomy, technical development as a social enterprise, quality engagement, and niche communities (Zulli et al. 2020). Nicholson et al. analyzed the code of conduct rules across popular Mastodon instances and compared them to the rules of Reddit’s subreddits. They compiled a dataset of 3,503 rules from the top 1,000 Mastodon instances by web scraping their "About" pages. The researchers coded and categorized these rules, finding that Mastodon’s rules emphasize creating a safe space free from harassment and hate speech (Nicholson et al. 2023). Al-Khateeb studied the largest Mastodon instance (mastodon.social), examining popular hashtags, the impact of bots, user locations, and sentiment and toxicity. Data was collected for 35 days via Mastodon’s API and analyzed using tools like Google’s Perspective API to detect toxicity levels in posts. The findings suggest that DOSNs like Mastodon give users more control over their feed content than the recommendation algorithms of centralized platforms do (Al-khateeb 2022a). Nobre et al. conducted an image analysis of pictures shared on Mastodon’s federated timeline. Using Google’s Cloud Vision API and Image Search, they identified explicit content, image sources, and how images propagated across Mastodon instances. The researchers found that most images originated from centralized platforms before being posted on Mastodon (Nobre et al. 2022).

2.2 User migration from Twitter to Mastodon

Zia et al. analyzed the migration of users from Twitter to Mastodon in the weeks following the Twitter acquisition. They found that new users mostly registered on a popular instance, such as mastodon.social. They interpret this as a sign of centralization in Mastodon, as 96% of users are registered in the top 25% of biggest instances (Zia et al. 2023), and suggest that users do this because they are used to networking on large social media platforms like Twitter. Moreover, they state that only a fraction of the users who migrated to the largest instance of Mastodon will migrate again to a more specific instance based on their topic preferences. Furthermore, Zia et al. identify two main reasons why a user would leave Twitter: (1) an ideological reason, where the user does not agree with the new company’s actions; and (2) a following-account reason, where people they follow have already migrated. Jeong et al. studied users who migrated from Twitter to Mastodon and cross-referenced matching users across both platforms (Jeong et al. 2023). Their findings show that Twitter is used for debating social issues, while Mastodon is used for niche topics. Their main finding is that user behaviour and Mastodon’s unique features contribute to user adoption of the decentralized platform. Jeong et al. follow up on this research by studying the migration of users from Twitter to multiple other platforms, such as Mastodon, Threads and Bluesky. The study investigates migrant groups, migration patterns and user perspectives on the new platforms (Jeong et al. 2024). The results reveal insights into the competitive dynamics and factors shaping user behaviour in the social media landscape. In 2023, La Cava et al. studied the user migration from Twitter to Mastodon using a timeframe similar to the one examined by Jeong et al. (2023) and Zia et al. (2023). The authors focus on the structure of the social media platform, the engagement of community members, and the language they use to communicate, all from the perspective of Twitter and what drives a Twitter user to migrate to Mastodon. Somewhat surprisingly, they found that users networking in sparse communities are more likely to migrate to Mastodon (La Cava et al. 2023). Additionally, users who engaged in conversations on Twitter around popular migration hashtags, such as #TwitterMigration, have a greater tendency to migrate to Mastodon. Zia et al. (2023) found that one of the reasons people adopt Mastodon is data control: it gives users the ability to control their personal data, and consequently more control over the use of that data in (third-party) data mining activities, as opposed to Big Tech platforms, where users have no control over their data except for what is provided by legislation such as the GDPR in the EU. Centralization tendencies in the distribution of accounts on Mastodon are studied by Lee and Wang (2023), who report that 96% of users join the largest 25% of instances. Wang et al. perform an analysis similar to that of Jeong et al. (2024), as they focus on users who migrated from Twitter to Mastodon, Bluesky and Threads; however, their focus is on the academic community active on social media platforms. The study found that academic communities faced significant challenges in sustaining engagement on Mastodon, with waning enthusiasm and fragmented communities compared to centralized platforms (Wang et al. 2024).

3 Methodology

3.1 Datasets

To comprehensively analyze Mastodon, a thorough data collection strategy was essential. This involved data crawling, which facilitated the gathering of extensive and meaningful information about both accounts and instances of Mastodon. To accomplish this, a series of software crawlers were developed in C# to interact with the Mastodon REST APIs (Mastodon 2022). It was essential to select a method that would permit access to the necessary data without the risk of modifying the content of the server, so only GET methods were used; the resulting data was received in JSON format. The Mastonet package (Lacasa 2017), a .NET library specifically designed to facilitate interaction with Mastodon APIs, was used for efficient handling of the Mastodon API methods in C#. Among the top ten most popular instances in April/May 2023, the five instances from which we collected data were those showing the most dynamics. We prioritized instances with English as the primary language, as their popularity and large, diverse user base, particularly on mastodon.social, attracted many non-native English speakers. The data was collected between the 15th of April and the 7th of June 2023 for the five instances, and the data for single-user instances was collected on the 6th of August 2023. In addition to collecting general data from five instances, four other particular datasets were used, focused on the following types: academic, Turkish, political and activism. The academic dataset consists of the servers listed on Mapperfr (2023), except for those which do not use HTTPS. The Turkish dataset consists of two servers: mastodon.com.tr and mastoturk.org. The political dataset consists of the following servers: social.overheid.nl, eupolicy.social, social.network.europa.eu, kolektiva.social, sociale.network, progressives.social, leftist.network, thinktanki.social, thecanadian.social and todon.nl. Finally, the activism dataset consists of three servers: bhre.social, blacksun.social and climatejustice.social.

The academic, political and activism datasets were chosen mostly to see whether, and how much, the Twitter acquisition has influenced these sectors.
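As an illustration of the read-only collection described above, the following minimal Python sketch builds a GET URL and reduces one returned account record to the fields needed for the later analysis. It is not the C#/Mastonet crawler used in this work. The choice of the public directory endpoint, the helper names `directory_url` and `parse_account`, and the sample record are assumptions made for illustration only.

```python
import json
from datetime import datetime
from urllib.parse import urlencode


def directory_url(instance, offset=0, limit=40):
    """Build a read-only GET URL for an instance's public profile directory.

    Assumption: the directory endpoint is used here only as an example of a
    GET method; the source does not state which endpoints were crawled.
    """
    query = urlencode(
        {"local": "true", "order": "new", "offset": offset, "limit": limit}
    )
    return f"https://{instance}/api/v1/directory?{query}"


def parse_account(raw):
    """Keep only non-identifying fields from one account JSON object."""
    # Normalise the trailing "Z" so older Python versions can parse it.
    created = datetime.fromisoformat(raw["created_at"].replace("Z", "+00:00"))
    return {
        "created_at": created,
        "bot": bool(raw.get("bot", False)),
        "statuses_count": int(raw.get("statuses_count", 0)),
    }


# Hand-written sample record (hypothetical, not collected data).
sample = json.loads(
    '{"id": "1", "created_at": "2022-11-05T00:00:00.000Z",'
    ' "bot": false, "statuses_count": 12}'
)
account = parse_account(sample)
print(directory_url("mastodon.social", offset=40))
print(account["created_at"].year, account["created_at"].month)
```

Only the URL is constructed here; issuing the request and paging through `offset` values is left out so that the sketch runs without network access.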

3.2 Data pre-processing and properties

To ensure the validity and reliability of this research, careful data cleaning and pre-processing steps were applied after crawling. This included handling missing or inconsistent data, such as null values resulting from users deleting their accounts, and formatting the data appropriately for further analysis. Account information such as user ID, username, account name or display name is deliberately excluded from this work, in accordance with privacy rules and our principles of protecting user identities. Even IDs were anonymized by adding noise. This approach is designed to enhance the transparency of our results while ensuring that the privacy of users is maintained. We give the utmost importance to privacy, so we are guided not only by the GDPR but also by privacy as a human-rights principle. Our data served as the basis for an in-depth longitudinal examination of the evolution and progression of Mastodon as a DOSN. Table 1 shows a summary of the dataset:

Table 1 Summary of data collected from all five instances
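The pre-processing steps described in Sect. 3.2 can be sketched as follows. This is a simplified stand-in, not the procedure actually used: the field names in `IDENTIFYING_FIELDS` are assumed, and instead of adding noise to IDs the sketch swaps each ID for a distinct random surrogate, which is a different (and here plainly substituted) pseudonymization technique.

```python
import random

# Assumed names of identifying fields; the real schema may differ.
IDENTIFYING_FIELDS = ("username", "acct", "display_name")


def preprocess(records, seed=0):
    """Drop incomplete records, strip identifying fields, replace IDs."""
    rng = random.Random(seed)
    # Records of deleted accounts typically carry null values; skip them.
    kept = [
        r for r in records
        if r.get("id") is not None and r.get("created_at") is not None
    ]
    # Draw distinct surrogate IDs so that no two accounts collide.
    surrogates = rng.sample(range(10**9), len(kept))
    cleaned = []
    for record, surrogate in zip(kept, surrogates):
        row = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
        row["id"] = surrogate
        cleaned.append(row)
    return cleaned


# Hypothetical input records for illustration.
raw_records = [
    {"id": "101", "username": "alice", "created_at": "2022-11-05"},
    {"id": "102", "username": "bob", "created_at": None},
    {"id": "103", "username": "carol", "created_at": "2023-02-10"},
]
clean_records = preprocess(raw_records)
print(len(clean_records), sorted(clean_records[0].keys()))
```

Because the surrogate mapping is not stored, the original IDs cannot be recovered from the cleaned rows, while relationships between rows that share a surrogate would have to be mapped in a single pass.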

4 Dynamics Analysis and Results

This section presents the analysis of the data gathered from multiple Mastodon instances. Python was used for the analysis of the dataset. The NetworkX and NetworKit libraries were used for analyzing the mastodon.social instance, focusing on gaining insights into centrality metrics, communities, and groups. NetworkX is a library that specializes in complex network and graph creation, manipulation, and analysis (Hagberg et al. 2008). NetworKit is a comprehensive open-source toolkit that specializes in high-performance network analysis (Staudt et al. 2014). The analysis was conducted on a personal computer and on Habrok, the high-performance computing cluster of the University of Groningen (Center for Information Technology 2023), which proved efficient in performing large-scale network analysis.

4.1 Account number dynamics

This section examines the account creation dates on several instances across different time checkpoints. This investigation offers insight into the evolution, adaptation, and patterns of each of the five Mastodon instances from which general account information has been gathered, as summarized in Table 1. The visualization of the data throughout this study has been implemented in Python, specifically using the Matplotlib (Hunter 2007) and Pandas (McKinney 2010) libraries.

The graph in Fig. 1 offers a visual representation of the evolution of the instances over the years, starting from the year each instance was created. We found a growth peak across all five instances in 2022. However, 2023 was marked by a large contraction for all instances except mastodon.social, which kept an almost constant growth of users over the first five months of 2023.

Fig. 1
Yearly overview of the registered accounts on five instances. See also Sabo et al. (2024)

Figure 2 shows that all instances experienced user growth in November 2022, although for mastodon.social this growth was not significant. The growth of mastodon.cloud is attributed to bot users: over 95% of its user base consists of bots, and the majority of them created their accounts in November 2022. Furthermore, we see a continuous decrease in new accounts after December 2022 for all instances going into 2023. The contraction in the growth of Mastodon instances after November 2022 can be attributed to the fact that the initial growth came from users dissatisfied with Twitter’s change in ownership, who quickly migrated to Mastodon. Once the hype surrounding Elon Musk died down, this migration decreased significantly, causing the growth of Mastodon instances to revert to levels similar to those prior to November 2022.

Fig. 2
Monthly evolution of the number of newly registered users on five instances after the Twitter ownership change. See also Sabo et al. (2024)
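The yearly and monthly overviews in Figs. 1 and 2 rest on grouping account creation dates by period. A minimal sketch of that grouping is given below using only the Python standard library; the function name `monthly_registrations` and the sample dates are illustrative assumptions, and the plots in this work were produced with Pandas and Matplotlib rather than with this code.

```python
from collections import Counter
from datetime import datetime


def monthly_registrations(created_at_values):
    """Count newly registered accounts per (year, month)."""
    counts = Counter()
    for value in created_at_values:
        moment = datetime.fromisoformat(value.replace("Z", "+00:00"))
        counts[(moment.year, moment.month)] += 1
    # Sort chronologically so the result can be plotted directly.
    return dict(sorted(counts.items()))


# Hypothetical creation timestamps, not collected data.
sample_dates = [
    "2022-10-30T00:00:00.000Z",
    "2022-11-03T00:00:00.000Z",
    "2022-11-19T00:00:00.000Z",
    "2022-11-27T00:00:00.000Z",
    "2023-01-08T00:00:00.000Z",
]
per_month = monthly_registrations(sample_dates)
print(per_month)
```

Summing the monthly counts per year gives the yearly overview, so the same helper can serve both figures.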

4.2 Network-influence based on centrality metrics

We applied three well-known centrality metrics to our dataset: Degree Centrality, Closeness Centrality and Eigenvector Centrality. We chose these metrics because together they allow us to identify influential users within our dataset from multiple perspectives. Centrality metrics provide a broad understanding of network influence, capturing not just direct connections but also the quality and reachability of those connections. The subsequent analysis derived from each centrality measure facilitated the identification of accounts of high relevance and influence within the network, specifically those ranking in the top 20 of each list (Newman 2018). The centrality algorithms were applied to 87,210 mastodon.social accounts, which follow a total of 864,588 accounts from 18,847 Mastodon instances. Degree Centrality quantifies the number of connections, or neighbours, of an account. An account’s centrality increases with its number of connections. Accounts with a high degree centrality typically exhibit high levels of activity or interaction within the network (Newman 2018). Closeness Centrality offers another perspective on node significance within a network, emphasizing the ‘distance’ between nodes rather than just the quantity of connections (Riveni 2022). This metric is calculated as the inverse of the average shortest-path length between a vertex and all other vertices. Nodes with high Closeness Centrality have shorter average distances, enabling efficient information dissemination (Newman 2018).

Regarding the research on mastodon.social, nodes with high Closeness Centrality play crucial roles in rapid information spread. Proximity to other nodes aids effective communication. Higher Closeness Centrality implies greater centrality within the digital community (Newman 2018). Eigenvector Centrality measures the importance of a node by the quantity and quality of its connections. It accounts for connections’ centrality. High Eigenvector Centrality involves connections with other highly central nodes, amplifying an account’s significance (Riveni 2022). Accounts exhibiting high Eigenvector Centrality meet at least one of the following criteria: they have many connections, they are connected to important neighbours with high centrality or both (Tables 2 and 3).

Table 2 Type and count of instances (Sabo et al. 2024)
Table 3 Instance and count (Sabo et al. 2024)
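To make the three definitions in Sect. 4.2 concrete, the sketch below computes them on a four-node toy graph. It is written in plain Python rather than with NetworkX or NetworKit so that it has no dependencies, it treats the graph as undirected (an assumption, since the following relation is directed), and the eigenvector variant uses a simple power iteration on A + I. It is an illustration of the definitions, not the implementation used in this work.

```python
import math
from collections import deque


def degree_centrality(adj):
    """Fraction of the other nodes each node is directly connected to."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def closeness_centrality(adj):
    """Inverse of the average shortest-path length to reachable nodes."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source node
            node = queue.popleft()
            for nb in adj[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def eigenvector_centrality(adj, iterations=200):
    """Power iteration on A + I; scores are normalised to unit length."""
    score = {node: 1.0 for node in adj}
    for _ in range(iterations):
        nxt = {
            node: score[node] + sum(score[nb] for nb in adj[node])
            for node in adj
        }
        norm = math.sqrt(sum(v * v for v in nxt.values()))
        score = {node: v / norm for node, v in nxt.items()}
    return score


# Toy undirected graph (hypothetical): a is linked to b, c, d; b to c.
toy = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
degree = degree_centrality(toy)
closeness = closeness_centrality(toy)
eigen = eigenvector_centrality(toy)
print(degree["a"], closeness["d"], max(eigen, key=eigen.get))
```

In the toy graph, node a is adjacent to every other node, so it ranks first under all three measures, whereas d, which is attached only to a, ranks last.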

4.2.1 Centrality analysis results

Through an evaluation of the results from these centrality measures, we determined that a group of six accounts appeared in the top 20 accounts of all three centrality measures. The consistent presence of these accounts across all three measures indicates their significance and considerable influence within the mastodon.social network. Figure 3 shows the results of the centrality metrics. The six accounts that match across all centralities are labelled ‘account number 1’ through ‘account number 6’. To protect the privacy of the accounts and adhere to the privacy principles explained in Sect. 3.2, the original user IDs of the accounts within the top 20 of all centrality metrics are not shown.
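The selection of accounts that rank in the top 20 of every centrality measure amounts to intersecting the top-k sets of the individual rankings. A minimal sketch, with hypothetical helper names and toy scores (k = 2 instead of 20), could look as follows.

```python
def top_k(scores, k):
    """Return the k nodes with the highest score."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return {node for node, _ in ranked[:k]}


def consistently_central(score_tables, k=20):
    """Nodes that appear in the top k of every centrality ranking."""
    return set.intersection(*(top_k(scores, k) for scores in score_tables))


# Hypothetical scores for four anonymised accounts.
degree_scores = {"acc1": 0.9, "acc2": 0.7, "acc3": 0.2, "acc4": 0.1}
closeness_scores = {"acc1": 0.8, "acc2": 0.3, "acc3": 0.6, "acc4": 0.1}
eigenvector_scores = {"acc1": 0.7, "acc2": 0.5, "acc3": 0.4, "acc4": 0.2}

shared = consistently_central(
    [degree_scores, closeness_scores, eigenvector_scores], k=2
)
print(shared)
```

Ties at the cut-off position are broken by insertion order in this sketch; a real analysis would need an explicit tie-breaking rule.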

4.3 Community analysis using the Louvain Algorithm

We conducted a community detection analysis using the Louvain algorithm. The Louvain algorithm was developed for discovering high-modularity partitions, i.e., communities, in large networks. Additionally, it helps reveal the complete hierarchical community structure inherent in the network, providing different views in community detection (Blondel et al. 2008). The Louvain algorithm was applied to the account following information dataset, which has a total of 864,588 accounts from multiple Mastodon instances.

4.3.1 Community analysis results

The results of the community analysis with the Louvain algorithm, shown in Tables 2 and 3, reveal 98 communities in the gathered data, which includes all accounts followed by the collected source users. Of the 98 communities, the largest has 236,477 nodes. Furthermore, the modularity of the community analysis is 0.58, indicating that mastodon.social has a strong community structure, where each community is well separated from the others. Apart from this, many communities have fewer than 100 nodes, and the smallest ones have 2 nodes. To gain more insight into these communities, a filter was applied based on the number of nodes, keeping only communities with more than 100 nodes. This leaves 37 communities. One interesting finding comes from our analysis of accounts that follow and are followed by accounts of users registered on mastodon.social. From an initial dataset of 87,210 accounts from mastodon.social, we found that the users registered on it belong to communities spanning 18,847 unique instances, based on following and followed-by links. This result shows a considerable diversity of communities across instances. It also shows that the users of mastodon.social do not restrict their following behaviour to their own instance, but want to see posts and read opinions from users with different backgrounds on different instances. Within those 18,847 instances, all five instances from which we collected general account information appear within the top 15 most popular instances in the results of the community analysis based on following links.
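The modularity value and the size filter reported above can be illustrated on a toy partition. The sketch below evaluates the standard modularity of a given partition of an undirected, unweighted graph and keeps communities above a size threshold. It does not run the Louvain algorithm itself, and the function names and toy data are assumptions for illustration.

```python
from collections import Counter


def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = Counter()    # edges with both endpoints in the community
    degree_sum = Counter()  # sum of node degrees per community
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


def large_communities(community_of, min_size):
    """Keep only communities with more than min_size nodes."""
    sizes = Counter(community_of.values())
    return {c: n for c, n in sizes.items() if n > min_size}


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
partition = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
q = modularity(toy_edges, partition)
print(round(q, 4), large_communities(partition, 2))
```

For the toy graph, each community holds 3 of the 7 edges and half of the total degree, giving Q = 2 × (3/7 − 1/4) = 5/14 ≈ 0.357. Values closer to 1 indicate that more edges fall inside communities than would be expected at random.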

Fig. 3
Centrality metrics results. See also Sabo et al. (2024)

The six accounts that appear within the top 20 of all centrality metrics, as explained in Sect. 4.2, were searched for within the obtained communities. The account that has the highest values among all centrality metrics, referred to as account number 1, belongs to the largest community. Furthermore, four users, namely account numbers 3, 4, 5 and 6, all belong to the second largest community, which has 160,374 nodes. The user referred to as account number 2 belongs to the third largest community, which has 104,886 nodes. These accounts belong to well-known people and organisations, such as the founder of Mastodon, a famous book author, a well-known Star Trek actor, a Washington Post tech reporter and the official Mastodon account. These accounts are influential based on both the centralities and the community analysis.

4.4 Following group-based analysis

This section presents the analysis of user groups within the data collected from mastodon.social, focusing specifically on the interaction patterns captured in the ‘following’ account data. The dataset analyzed is the account following information dataset, which has a total of 864,588 accounts from multiple Mastodon instances. Initially, the dataset was structured as a directed graph, where each edge (a source_id and target_id pair) corresponds to a ‘following’ relationship. It is important to note that not all ‘following’ relationships are mutual. Out of a total of 8.36 million edges in the dataset, 718,824 edges were extracted that represent mutual ‘following’ relationships, yielding 359,412 pairs of nodes. This subset, accounting for approximately 8.6% of the total number of edges, is the primary focus of this section of the analysis.
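Extracting the mutual relationships from the directed edge list can be sketched as follows: an edge (source_id, target_id) is mutual when the reverse edge is also present, and each mutual pair therefore accounts for two directed edges. The function name and the toy edge list are illustrative assumptions.

```python
def mutual_pairs(directed_edges):
    """Return unordered node pairs that follow each other."""
    edge_set = set(directed_edges)
    return {
        frozenset((source, target))
        for source, target in edge_set
        if source != target and (target, source) in edge_set
    }


# Hypothetical (source_id, target_id) 'following' edges.
following = [(1, 2), (2, 1), (1, 3), (3, 4), (4, 3), (2, 3)]
pairs = mutual_pairs(following)
reciprocated_edges = 2 * len(pairs)
share = reciprocated_edges / len(following)
print(len(pairs), reciprocated_edges, round(share, 3))
```

The same relation holds for the figures reported above: 359,412 pairs correspond to 2 × 359,412 = 718,824 directed edges, which is about 8.6% of 8.36 million.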

4.4.1 Results of the analysis of groups

The dataset was transformed into multiple undirected graphs, in which each edge signifies a mutual ‘following’ relationship between two nodes. This transformation facilitates the examination of ‘cliques’ within the collected data. The clique algorithm was applied only to undirected graphs with more than 2 nodes. These groups are characterized by a high density of internal connections and are often of significant interest in the exploration of social dynamics, influence spread, and community detection. The results are shown in Table 4.

Table 4 Summary of group analysis of mastodon.social. See also Sabo et al. (2024)

Moreover, the top 20 nodes with the highest frequency within the identified cliques were further isolated. The node with the greatest frequency appears in 1,338,364 cliques, which constitutes 63% of all discovered cliques, and the node in 20th position in this ranking appears in 483,960 cliques, corresponding to 23% of all cliques. A particularly interesting discovery within this set of top 20 nodes is that all nodes represent users from a specific country who are active on mastodon.social. However, none of these users appears in the maximal clique of 27 nodes identified in the analysis. According to Netblock, a globally recognised internet monitor operating at the intersection of digital rights, cybersecurity, and internet governance, several social media platforms, such as Twitter, Facebook, and Instagram, were restricted in the specific country at the point when we saw this peak (Netblock.org 2019). Upon further research, we found no indications of any such restrictions imposed on the use of Mastodon. These findings suggest that a substantial number of tightly-knit user groups from the specific country were active on mastodon.social until the spring of 2020. It is highly plausible that these groups migrated from the aforementioned restricted platforms, using mastodon.social as an alternative platform. We do not reveal the specific country, for security reasons, but it is clear that at the point we saw the spike, the use of other platforms was restricted.
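The group analysis in this section relies on enumerating cliques in the undirected mutual-following graph and counting how often each node occurs in them. The sketch below uses the Bron-Kerbosch algorithm with pivoting to list maximal cliques on a toy graph. Which clique algorithm was actually used is not specified here, so this choice, the helper names and the data are assumptions for illustration.

```python
from collections import Counter


def build_adjacency(pairs):
    """Undirected adjacency sets from mutual-following pairs."""
    adj = {}
    for u, v in pairs:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj):
    """Yield every maximal clique (Bron-Kerbosch with pivoting)."""
    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            yield clique  # nothing can extend it, so it is maximal
            return
        # Pivot on the node covering most candidates to prune branches.
        pivot = max(candidates | excluded,
                    key=lambda n: len(adj[n] & candidates))
        for node in list(candidates - adj[pivot]):
            yield from expand(clique | {node},
                              candidates & adj[node],
                              excluded & adj[node])
            candidates = candidates - {node}
            excluded = excluded | {node}

    yield from expand(frozenset(), set(adj), set())


# Hypothetical mutual pairs: a triangle a-b-c plus a pendant edge c-d.
toy_adj = build_adjacency([("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")])
cliques = set(maximal_cliques(toy_adj))
membership = Counter(node for clique in cliques for node in clique)
largest = max(cliques, key=len)
print(len(cliques), membership.most_common(1), sorted(largest))
```

Dividing a node's membership count by the total number of cliques gives its share of cliques, which is how percentages such as those reported above can be derived. The recursive formulation is suitable only for small graphs; large networks need an iterative or library implementation.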

4.5 Individual-account instance analysis

Lastly, we conducted an analysis of single-user instances. In total, we have retrieved 4908 instances that have only one user. We collected data regarding instance creation and we found out that out of these 4908 instances, 1959 instances were created in November 2022.

5 User adoption on Mastodon based on activity

In this section, we explore the adoption of users across the five instances from which we collected data: mastodon.social, mastodon.cloud, mstdn.social, mastodon.online and mastodon.world. Here we use the dataset from Sabo et al. (2024). To investigate the user adoption rate across these five Mastodon instances, we examined how long users have been active on the platform using the existing dataset. The aim was to further study the migration from Twitter to Mastodon by analysing the lack of recent user activity that accompanies the spike in account registration. The results are visualized in Figs. 4, 5 and 6. Notably, a majority of users never post, resulting in an activity span of 0 days. Additionally, local peaks around 200 and 400 days of activity can be observed across all instances. While we have no specific theories, the 400-day peak could be explained by users creating Mastodon accounts around the time of Twitter’s acquisition, which was approximately 420 days before this study. The data represents a lower bound of user activity spans. Users can be active on social networks without posting, which our measurement approach does not account for. These "lurkers", who only consume content, are counted as inactive users, which explains the large volume of users with a 0-day activity span across all instances.

Fig. 4
User adoption for Mastodon Cloud and Mastodon Social

Fig. 5
User adoption for Mastodon Online and Mstdn Social

Fig. 6 User adoption for Mastodon World

6 Country-level analysis

In this section, we investigate Mastodon instances based on location, as well as adoption dynamics based on dedicated topics.

6.1 Analysis of Turkish Mastodon accounts

Following our discovery of users from a specific country described in Sect. 4.4, we decided to look for further groups of users from a particular country around a specific event. Thus, we looked into users from Turkey following the earthquake that occurred in February 2023. We retrieved 120 Turkish users over six instances, including the five initial instances and mastodon.com.tr and mastoturk.org. Indeed, we found a number of newly registered Turkish users on Mastodon, shown in Fig. 7. First, there is the November 2022 surge in new users following the takeover of Twitter, followed by a small increase in registered users in February 2023, right after the earthquake.

Fig. 7 Monthly evolution of the number of newly registered users from Turkey. See also Sabo et al. (2024)

Moreover, we retrieved more data about Turkish users from the previously mentioned Turkish instance, using mastodon.social as a baseline for comparison. By grouping registration dates by month, we assessed the impact of significant events on Mastodon adoption (Mapperfr 2023). The shift in user registration and instance creation in October 2022 is evident in the bar charts for the academic dataset in Figs. 8 and 9. Similarly, the Turkish dataset shows a notable uptick in registrations in February 2023, coinciding with the earthquake, as seen in Fig. 10. These results show event-related spikes in account registration. They are not sufficient to generalise, as more instances need to be studied, but they provide a basis for further hypotheses and a slight indication that external events are one of the factors that prompt people to register on Mastodon.
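Grouping registration dates by month and flagging unusually busy months could look like the sketch below. It is an illustration only: the rule of flagging months that exceed 1.5 times the median monthly count is our assumption, not a criterion used in the study, and the toy counts are invented to mimic the shape of the reported pattern.

```python
from collections import Counter
from statistics import median

def registrations_per_month(created_at_values):
    """Count registrations per calendar month from ISO 8601 timestamps."""
    return Counter(ts[:7] for ts in created_at_values)  # "YYYY-MM" prefix

def spike_months(monthly_counts, factor=1.5):
    """Months whose count exceeds `factor` times the median monthly count."""
    baseline = median(monthly_counts.values())
    return sorted(m for m, n in monthly_counts.items() if n > factor * baseline)

# Hypothetical toy data: number of registrations per month.
toy_counts = {"2022-09": 1, "2022-10": 2, "2022-11": 6,
              "2022-12": 2, "2023-01": 1, "2023-02": 4}
toy_created_at = [f"{month}-15T00:00:00.000Z"
                  for month, n in toy_counts.items() for _ in range(n)]

monthly = registrations_per_month(toy_created_at)
spikes = spike_months(monthly)
print(spikes)
```

Months without any registration do not appear in the counter, so a real analysis would need to fill them in with zeros before computing a baseline.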

Fig. 8 Creation of academic instances

Fig. 9 User registration on academic instances

Fig. 10 User registration on Turkish instances

6.2 Country-specific comparison between instances

We did a second round of data collection to retrieve users from certain countries. We ran our data collection program on new instances and retrieved information about users from the USA, Canada, the United Kingdom, Germany, Italy and the Netherlands. This data was collected between 25 and 30 December 2023. Table 5 shows the results of data collection from country-specific instances. The data collected is general account information. Furthermore, we gathered accounts from these countries from the initial dataset by creating a new program that extracts certain keywords from the user information of the 5 instances previously analyzed, so that we could identify users from specific countries within the data already gathered. Table 6 presents the results of this extraction of users from specific countries from the initial 5 instances.
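A keyword-based extraction of this kind can be sketched as follows. The keyword lists are hypothetical placeholders, since the keywords actually used are not listed here, and the choice of matching against the `display_name` and `note` (profile bio, stored as HTML) fields of an account record is likewise an assumption.

```python
import re

# Hypothetical keyword lists; the keywords used in the study are not given.
COUNTRY_KEYWORDS = {
    "Canada": ["canada", "canadian", "toronto"],
    "Netherlands": ["netherlands", "nederland", "dutch", "amsterdam"],
}

# One case-insensitive whole-word pattern per country.
PATTERNS = {
    country: re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    for country, keywords in COUNTRY_KEYWORDS.items()
}
HTML_TAG = re.compile(r"<[^>]+>")

def countries_mentioned(account):
    """Return the countries whose keywords occur in the profile text."""
    bio = HTML_TAG.sub(" ", account.get("note", ""))  # strip HTML markup
    text = account.get("display_name", "") + " " + bio
    return {c for c, pattern in PATTERNS.items() if pattern.search(text)}

# Hypothetical toy profiles.
profile_nl = {"display_name": "Anna",
              "note": "<p>Historian based in <b>Amsterdam</b>.</p>"}
profile_ca = {"display_name": "Canadian birder", "note": ""}
profile_none = {"display_name": "Sam", "note": "<p>Hello world</p>"}
print(countries_mentioned(profile_nl), countries_mentioned(profile_ca))
```

Keyword matching of this sort is inherently noisy: a profile can mention a country without its owner living there, which is the same caveat that applies to inferring nationality from instance membership.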

Table 5 Summary of data collected from country-specific instances
Table 6 Summary of account following information collected from the Canadian and Dutch instances

We chose to compare only Canada and the Netherlands. The US and UK instances have highly diverse user demographics: both are English-speaking countries, so their instances are likely to attract users from other countries. Moreover, the US instance is focused on the San Francisco Bay Area, meaning that it would be a very small sample of users from the US. Lastly, the significantly larger user base of the German and Italian instances made them less comparable to the Canadian instance. For this comparison, we retrieved the account following information for the two instances, summarized in Table 6.

6.3 Centrality analysis

We ran the centrality analysis algorithms on both the Dutch and Canadian instances (Sabo et al. 2024).

We find that all accounts within the Dutch and Canadian top 20 centralities are from the respective country’s instance. In Figs. 11 and 12, we looked at accounts that are matched across all three centrality metrics (degree, closeness and eigenvector), and we found 6 such Canadian accounts and 9 such Dutch accounts. This suggests that the Dutch server has a larger central group of influential accounts than the Canadian instance.
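One way to identify accounts that are matched across the three metrics is to intersect the top-k sets of each ranking. The sketch below does this in plain Python on a toy graph. It rests on several assumptions that are ours and not confirmed by the study: the graph is treated as undirected and connected, eigenvector centrality is approximated by a shifted power iteration, and "matched" is read as "present in the top k of all three rankings".

```python
from collections import deque

def degree_centrality(adj):
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}

def closeness_centrality(adj):
    """Reachable nodes divided by summed BFS distances (connected graph assumed)."""
    scores = {}
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores[src] = (len(dist) - 1) / total if total else 0.0
    return scores

def eigenvector_centrality(adj, iterations=200):
    """Power iteration on A + I (the shift avoids oscillation)."""
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        nxt = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        norm = sum(val * val for val in nxt.values()) ** 0.5
        x = {v: val / norm for v, val in nxt.items()}
    return x

def top_k(scores, k):
    return set(sorted(scores, key=scores.get, reverse=True)[:k])

def consistently_central(adj, k=20):
    """Accounts that are in the top k of all three centrality rankings."""
    return (top_k(degree_centrality(adj), k)
            & top_k(closeness_centrality(adj), k)
            & top_k(eigenvector_centrality(adj), k))

# Hypothetical toy graph: a hub with four neighbours, two of them connected.
toy_adj = {"hub": {"a", "b", "c", "d"}, "a": {"hub", "b"},
           "b": {"hub", "a"}, "c": {"hub"}, "d": {"hub"}}
central_top1 = consistently_central(toy_adj, k=1)
central_top3 = consistently_central(toy_adj, k=3)
print(central_top1, central_top3)
```

With k set to 20, the size of the returned set corresponds to the kind of count reported above (6 Canadian and 9 Dutch accounts), although the exact matching rule used in the study may differ.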

Fig. 11 Canadian centrality metrics results

Fig. 12 Dutch centrality metrics results

6.4 Community analysis

Table 7 shows that the Canadian instance has communities with larger numbers of users than the Dutch instance, whereas the Dutch instance has more tightly-knit communities with fewer people. From this observation, we hypothesize that people from the Netherlands use Mastodon to socialize within small communities, whereas Canadian users form large communities and thus have broader social networks.

Table 7 Comparison of Communities between Canada and Netherlands

In addition, we explore the diversity of instances within the communities of the two instances. From Tables 8, 9, 10 and 11, we find that Dutch communities contain fewer unique instances in total, but more national instances, than the Canadian communities. Interestingly, the most prominent instance in the communities of both countries is mastodon.social. We listed the 10 largest instances found within the communities of each nation, plus the instance we are comparing it with, and found that 6 of the 10 instances are common to the two nations.

Table 8 Canadian communities type and count
Table 9 Canadian communities instances and count
Table 10 Dutch communities type and count
Table 11 Dutch communities instances and Count
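The per-community instance counts summarized in these tables can be derived from account handles, as in the sketch below. It assumes that each account is identified by its Mastodon `acct` value, which is `username` for accounts on the home instance and `username@domain` for remote ones. The home instance names `canada.example` and `nl.example` and all handles are hypothetical placeholders.

```python
from collections import Counter

def instance_of(acct, home_instance):
    """Domain part of an acct handle; local accounts map to the home instance."""
    return acct.split("@", 1)[1].lower() if "@" in acct else home_instance

def instance_counts(community_accts, home_instance):
    """Number of community members per instance."""
    return Counter(instance_of(a, home_instance) for a in community_accts)

def shared_top_instances(counts_a, counts_b, k=10):
    """Instances that are among the k largest in both communities."""
    top_a = {inst for inst, _ in counts_a.most_common(k)}
    top_b = {inst for inst, _ in counts_b.most_common(k)}
    return top_a & top_b

# Hypothetical toy communities.
canadian = ["alice", "bob@mastodon.social", "carol@mastodon.social",
            "dan@other.example"]
dutch = ["eva", "finn", "gijs@mastodon.social", "hanna@third.example"]

counts_ca = instance_counts(canadian, "canada.example")
counts_nl = instance_counts(dutch, "nl.example")
print(len(counts_ca), len(counts_nl))            # unique instances per community
print(shared_top_instances(counts_ca, counts_nl))
```

Applied with k = 10, the size of the shared set corresponds to the number of common instances reported above.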

Regarding the academic dataset, we saw a massive spike in both user registration and instance creation in November 2022. Academic instance creation reacted even faster than user registration: already in October 2022, the increase is much more pronounced for instance creation than for user registration. This is expected, because once an instance is created, it takes some time for users to find it and register on it (Figs. 13 and 14).

Fig. 13 User registration on instances related to activism

Fig. 14 User registration on political instances

In the activism and political datasets, we again see that the acquisition of Twitter overshadows any other influence on Mastodon’s adoption. There are no significant peaks other than those seen in the academic dataset. However, in the activism dataset, people were quicker to respond to the takeover. We cannot generalize this result, as many more servers would need to be studied. Nevertheless, our results show the same dynamics across the instances that we analysed.

7 Limitations

Our limitations include the possibility that users deleted their accounts or stopped using Mastodon after our data collection period: we only assessed user activity during the timeframe of our data collection and did not track it afterwards. For country-specific instances, we cannot be certain that every user is from that country or speaks the language; we assume that users registered on such instances have some relationship to the country. Additionally, we did not collect data on all users from each instance, although our dataset is considerably large.

8 Conclusion and future work

The main objective of this study was to conduct a comprehensive analysis of Mastodon and its account growth. mastodon.social is the largest instance within Mastodon and provides the most comprehensive information among all instances. Although the gathered data does not cover all accounts within an instance, it agrees with the information available on The Federation, a website that gathers statistics about nodes in the fediverse, on the most important result: Mastodon instances saw an increase in the number of newly registered users in November 2022. Upon evaluating the communities within multiple Mastodon instances, we found that the accounts from a single instance interacted with accounts on more than 18,000 unique instances. Similar metrics were applied to country-specific instances, where we analyzed Canadian, Dutch and Turkish instances. These results confirm the significant diversity in following relationships across instances in a decentralized network. The study of groups within mastodon.social revealed millions of groups of users within less than 10% of the gathered data. This indicates that there are many closely connected groups of people who use Mastodon to network among themselves. Furthermore, our analysis indicates that Mastodon is seen as a platform that provides a safe online environment in terms of freedom of expression and freedom from non-interference. One topic that we plan to explore in future work is whether echo chambers exist on Mastodon and, if so, on which properties they are formed. Another promising direction is a cross-referencing analysis of data collected from Twitter and Mastodon users, to examine group migration patterns between the platforms.