Introduction

The introduction of the quote tweet feature by Twitter in April 2015 marked a significant development in the platform’s functionality. While a conventional retweet merely reproduces the original tweet, serving as a signal of agreement and endorsement between the users involved [1], the quote tweet feature allows users to include their own commentary when sharing a tweet. Consequently, this feature has given rise to various novel applications, including the expression of opinions, public replies, and content forwarding [2]. Notably, owing to the perennial significance of the US presidential elections [3, 4], Twitter instituted a novel policy on October 9, 2020, advising users to abstain from mere retweeting and advocating instead for quote tweets supplemented by individual perspectives. This policy remained in effect until December 16, 2020. Indeed, before the policy change, retweeting on Twitter was simple: with a single click, users could share a post with their followers. However, while the policy was in effect, clicking the retweet button no longer automatically shared the post. Instead, Twitter prompted users to add their own thoughts or comments before sharing, essentially creating a “Quote Tweet.” This extra step was intended to encourage users to share more thoughtfully. Importantly, adding text to the quote tweet was optional; users could still leave the comment section blank and share the post without any additional commentary, which essentially replicated the old retweet functionality.

Significance of the research

This research holds significance in the realm of social media and communication studies, particularly in understanding the impact of policy interventions on user behavior. The significance can be delineated through various dimensions. First, the study provides a comprehensive evaluation of the effectiveness of Twitter’s quoting encouragement policy implemented during the 2020 U.S. presidential election. By employing a robust big data approach and sophisticated analytical methods, the research goes beyond anecdotal observations, offering a nuanced understanding of how such policies influence user engagement. This contribution is valuable for social media platforms seeking evidence-based insights into the outcomes of policy interventions, aiding in the refinement of existing policies and the formulation of new ones. Second, the findings offer actionable insights for social media policymakers and practitioners involved in the delicate task of shaping user behavior. Understanding the quantitative and qualitative effects of the policy shift allows for the optimization of future interventions, fostering more effective communication strategies on platforms like Twitter. Policymakers can leverage the identified user characteristics and behavioral patterns to tailor interventions that resonate with the diverse user base, thereby enhancing the impact of social media policies. Finally, the research enriches the theoretical landscape by applying the Motivation Crowding Theory, Theory of Planned Behavior (TPB), and Theory of Diffusion of Innovation (DOI) to the context of social media policy adoption. This interdisciplinary approach contributes to theoretical advancements, offering a framework that can be applied beyond the scope of this study. As theories from economics and psychology are employed to understand user behavior in the digital age, the research paves the way for cross-disciplinary collaborations and a more holistic comprehension of online interactions.

Research gap

Despite the existing body of literature on quoting behavior on Twitter, there is a conspicuous gap in addressing the unique policy implemented by Twitter from October 9 to December 16, 2020, encouraging users to quote instead of retweeting. Previous studies have explored the use of quotes in various contexts, such as political discourse and the spread of misinformation, but none have specifically examined the impact of a platform-wide policy shift promoting quoting behavior. In addition, while some studies have investigated user behaviors associated with quoting, retweeting, and other tweet types, there is a lack of a comprehensive framework that assesses the quantitative and qualitative effects of a specific policy intervention. The current study introduces a detailed evaluation framework, incorporating time series analysis, text analysis, and sentiment analysis, providing a nuanced understanding of the Twitter quoting encouragement policy’s impact on user engagement. Moreover, previous research has explored user characteristics in the context of social media engagement but has not specifically addressed how users' attributes may influence their response to a platform-wide policy change. The current study bridges this gap by investigating how factors like social network size and activity levels correlate with users’ adoption of the quoting encouragement policy. Finally, while some studies have assessed the immediate effects of policy interventions, there is a lack of research investigating the longitudinal impact after the withdrawal of such policies. The current study extends the temporal dimension by examining user behavior during the pre-, within-, and post-intervals, offering insights into the sustained effects and user adaptation following the cessation of the quoting encouragement policy. 
By addressing these research gaps, the current study seeks to provide a holistic examination of the quoting encouragement policy on Twitter, contributing valuable insights to the fields of social media studies, policy evaluation, and user behavior analysis.

Research objectives

This study aims to assess the effectiveness of the Twitter policy implemented from October 9 to December 16, 2020, which encouraged users to utilize the quote tweet feature instead of simple retweeting. Specifically, the research objectives are twofold: (1) to determine the adoption rate of this policy and evaluate its success, and (2) to identify user characteristics based on their reactions to this policy. The outcomes of this research contribute to both the evaluation of the Twitter policy and the development of more effective approaches to policymaking. Stier et al. [5] proposed a comprehensive framework comprising four phases for the policymaking process: agenda setting, policy formulation, policy implementation, and evaluation. According to this framework, the evaluation phase involves assessing the outcomes of the policy, considering the perspectives of stakeholders involved in the previous phases. In this context, the present research examines the Twitter quoting encouragement policy, which represents an intervention in the daily usage patterns of Twitter users, through both quantitative and qualitative analyses. The quantitative effects analysis, particularly the achievements observed, provides valuable insights for evaluating the efficacy of the quoting encouragement policy. Additionally, the results obtained from the qualitative analyses facilitate policy implementation, which refers to the process of translating policies into practical action under the guidance of an authoritative body.

Quantitative effects

In this section, we present the hypotheses formulated to assess the quantitative effects of the Twitter quoting encouragement policy. The hypotheses are as follows:

  • H1: The intervention is expected to have a negative impact on users’ retweeting behavior. We hypothesize that the policy promoting the use of quote tweets instead of simple retweets will lead to a reduction in the frequency of retweeting among users.

  • H2: The intervention is unlikely to significantly affect other types of user behavior, such as posting original or reply tweets, as well as quotes. We anticipate that any observed changes in the rates of these tweet types would be of minimal magnitude and primarily influenced by factors unrelated to the intervention.

  • H3: The termination of the intervention is anticipated to have a positive effect on users' retweeting behavior. We hypothesize that the discontinuation of the policy encouraging quote tweets will result in an increase in users' retweeting activity.

  • H4: Similar to H2, the conclusion of the intervention is not expected to impact other tweet types (excluding quotes) in terms of posting behavior. This suggests the presence of a prevailing opinion inertia, where users tend to maintain their existing patterns and tendencies when posting original, reply, and non-quote tweets.

These hypotheses serve as a foundation for analyzing the quantitative effects of the Twitter quoting encouragement policy and investigating its influence on users’ tweet behaviors. Through rigorous analysis, we aim to shed light on the impact of the intervention and its implications for user engagement on the platform.

Qualitative effects

The qualitative effects can be examined from two distinct perspectives: User Characteristics and Text Characteristics. Moreover, the analysis encompasses three intervals, namely the Pre-Interval (prior to the policy implementation), Within Interval (during the policy implementation), and Post-Interval (after the policy withdrawal). The hypotheses for each perspective are as follows:

User characteristics

  • H5: Users with a larger social network (i.e., more friends) are expected to exhibit a lesser increase in their quoting rate during the Within Interval.

  • H6: Users who demonstrate a regular pattern of activity, i.e., spreading a lower overall volume of Twitter engagement across more days (publishing at least one tweet of any type on more days) rather than posting in bursts, are more inclined to experience an elevation in their quoting rate during the Within Interval.

  • H7: Users who engage in a higher volume of retweeting activities during the Pre-Interval are more likely to observe an increase in their quoting rate during the Within Interval.

  • H8: The swifter users experience an increase in their quoting rate during the Within Interval, the sooner they are likely to discontinue quoting tweets upon entering the Post-Interval.

Text characteristics

  • H9: Short-term quoters tend to exhibit a comparatively smaller change in the length of their quote texts than long-term quoters. This is primarily because the quotes of the former are largely prompted by the intervention, whereas those of the latter are more intentionally created.

  • H10: The sentiment of quote texts from short-term quoters is generally more likely to elicit a greater range of emotions compared to those from long-term quoters. This difference is attributable, at least in part, to the intervention's influence on short-term quoters.

  • H11: The quote texts of short-term quoters are generally more prone to receiving a higher number of retweets compared to those of long-term quoters. This can be attributed to factors such as longer text length, less deliberative content, and the presence of heightened emotional elements in the former.

These hypotheses form the basis for analyzing the qualitative effects of the Twitter quoting encouragement policy, enabling a comprehensive understanding of user and text characteristics during different intervals. By examining these effects, we aim to shed light on the nuanced dynamics that underlie users’ quoting behavior and its implications on social interaction and engagement within the Twitter platform.

Theoretical framework

In alignment with the two main parts of this research, which examine the quantitative and qualitative effects of the recent Twitter policy, the theoretical framework is also divided into two contexts: one for quantitative analysis and the other for investigating qualitative effects. For the quantitative analyses, the motivation crowding theory has played a central role in shaping the corresponding hypotheses. This theory suggests that providing extrinsic incentives for specific behavior can sometimes undermine intrinsic motivation to engage in that behavior [6]. Although the motivation crowding theory originated in the realm of economics [7], this study aims to apply it to the adoption of policies within the context of Twitter. By treating the quoting encouragement policy as an incentive, this research seeks to quantify the impact of this incentive during its implementation and withdrawal. Hypotheses 1–4 have been formulated to guide these quantitative analyses and explore the potential influence of the undermining effect on the adoption rate after the policy withdrawal.

Regarding the qualitative analyses, the TPB and the DOI serve as foundational frameworks for developing hypotheses related to user and text characteristics. The TPB explains behavior based on individuals' beliefs through three key components: attitude, subjective norms, and perceived behavioral control, which collectively shape behavioral intentions. Drawing on the TPB, hypotheses 5–8 aim to characterize different users based on their behaviors and attitudes toward the new policy. The DOI provides a platform for distinguishing users based on the time of adoption. In line with this theory, hypotheses 9–11 have been formulated to address characteristics that facilitate early adoption based on the content of quote texts. Figure 1 illustrates the theoretical framework of this study, highlighting its key components.

Fig. 1 Theoretical framework

Uniqueness and generalizability

To the best of our knowledge, this research represents the first comprehensive study to investigate the impact of the quoting encouragement policy implemented by Twitter. In comparison to the limited existing studies that have examined Twitter policies in the past, this research distinguishes itself through both the scale of the dataset utilized and the breadth of the analyses conducted. These unique aspects contribute to the applicability of this study in two key areas: methodology and findings. In terms of methodology, the presented approach incorporates an interrupted time series analysis framework, coupled with text and sentiment analyses, to examine policy interventions on a large scale. This framework enables researchers to develop various approaches for analyzing interventions within the realm of social media and big data. With regard to the findings, the extraction of qualitative and quantitative patterns from such a vast dataset yields novel insights. Particularly noteworthy is the ability to juxtapose these macro and micro results, leading to a deeper understanding of the policy’s effects. The findings of this study hold potential value for practitioners and policymakers not only on Twitter but also on other global platforms like Instagram and YouTube. However, it is important to consider certain modifications, such as adapting the follower-to-following ratio, when applying these findings to undirected networks like Facebook, where mutual agreement is necessary for link creation. Moreover, the analysis of this policy, which was implemented during the presidential election, provides valuable insights into its potential impact on public attention. Public attention has recently been identified as a critical factor in the success of presidential candidates [8]. Therefore, understanding the effects of the quoting encouragement policy can contribute to a better understanding of the dynamics surrounding public attention during such critical periods.
Indeed, the uniqueness of this research lies in its pioneering examination of the Twitter quoting encouragement policy, extensive dataset, and comprehensive analyses. These distinct features enhance the applicability of the research in terms of methodology and findings, with potential implications for other global platforms and the study of public attention in political contexts.

Literature review

Given the nature of this research, which focuses on a novel Twitter policy that promotes quoting instead of retweeting, the literature review examines three perspectives: (1) Quote, (2) Engagement, and (3) Hashtag Adoption. These perspectives encompass relevant aspects that align with the scope of this study.

Quote

Garimella et al. [2] conducted a study on the utilization of the newly introduced “quote RT” feature on Twitter, specifically examining its role in political discourse and the sharing of political opinions within the broader social network. Their findings indicated that users who were more socially connected and had a longer history on Twitter were more likely to employ quote RTs. Furthermore, they observed that quotes facilitated the dissemination of political discourse beyond its original source. In a different context, Jang et al. [9] employed the rate of quotes as a measure to identify and detect fake news on Twitter. Their research focused on leveraging quotes as a means of analyzing the spread of misinformation and distinguishing it from authentic news. Li et al. [10] tried to identify users with high dissemination capability under different topics. Additionally, Bodaghi et al. [11] investigated the characteristics of users involved in the propagation of fake news, considering quotes and their combined usage with other tweet types such as retweets and replies. Their analysis aimed to gain insights into the user behaviors associated with the dissemination of false information. South et al. [12] utilized the quoter model, which mimics the information generation process of social media accounts, to evaluate the reliability and resilience of information flow metrics within a news–network ecosystem. This study focused on assessing the validity of these metrics in capturing the dynamics between news outlets engaged in a similar information dissemination process. By reviewing these studies, we can identify their relevance to the understanding of quoting behavior and its implications within different contexts, such as political discourse and the spread of misinformation. 
However, it is important to note that these previous works primarily focused on the usage of quotes and their effects without specifically addressing the Twitter policy under investigation in this study.

Engagement

The concept of engagement on social media platforms, particularly in relation to political communication and online interactions, has been extensively explored in previous studies. Boulianne et al. [13] conducted research on the engagement rate with candidates’ posts on social media and observed that attack posts tend to receive higher levels of engagement, while tagging is associated with a general trend of lower engagement. Lazarus et al. [14] focused on President Trump’s tweets and found that engagement levels vary depending on the substantive content of the tweet, with negatively toned tweets and tweets involving foreign policy receiving higher engagement compared to other types of tweets. Yue et al. [15] delved into how nonprofit executives in the U.S. engage with online audiences through various communication strategies and tactics. Ahmed et al. [16] examined Twitter political campaigns during the 2014 Indian general election. Bodaghi et al. [17] conducted a longitudinal analysis on Olympic gold medalists on Instagram, investigating their characteristics as well as the rate of engagement they receive from their followers. Hou et al. [18] studied the engagement differences between scholars and non-scholars on Twitter. Hoang et al. [19] aimed at predicting whether a post is going to be forwarded or not. Munoz et al. [20] proposed an index as a tool to measure engagement based on the tweet and follower approach.

The decision of an online social network user to join a discussion group is not solely influenced by the number of friends who are already members of the group. Backstrom et al. [21] discovered that factors such as the relationships between friends within the group and the level of activity in the group also play a significant role in the user’s decision. Hu et al. [22] performed an empirical study on Sina Weibo to understand the selectivity of retweeting behaviors. Moreover, Balestrucci et al. [23] studied how credulous users engage with social media content. Bodaghi et al. [24] explored the impact of dissenting opinions on the engagement rate during the process of information spreading on Twitter. Wells et al. [25] examined the interactions between candidate communications, social media, partisan media, and news media during the 2015–2016 American presidential primary elections. They found that social media activity, particularly in the form of retweets of candidate posts, significantly influenced news media coverage of specific candidates. Yang et al. [26] investigated the tweet features that trigger customer engagement and found a positive correlation between the rate of quoting and the number of positive quotes. Bodaghi et al. [27] studied the role of users’ position in Twitter graphs in their engagement with viral tweets. They demonstrated how different patterns of engagement can arise from various forms of graph structures, leading to the development of open-source software for characterizing spreaders [28, 29].

Hashtag adoption

The adoption and usage of hashtags on Twitter have been investigated in several studies, shedding light on the factors influencing individual behavior and the role of social networks. Zhang et al. [30] explored the behavior of Twitter users in adopting hashtags and specifically focused on the effect of “structure diversity” on individual behavior. Their findings suggest that users' behavior in online social networks is not solely influenced by their friends but is also significantly affected by the number of groups to which these friends belong. Tian et al. [31] investigated the impact of preferred behaviors among a heterogeneous population on social propagation within multiplex-weighted networks. Their research shed light on the diverse adoption behaviors exhibited by individuals with varying personalities in real-world scenarios. Turning to hashtag use on Twitter, Monster et al. [32] examined how social network size influences people's likelihood of adopting novel labels. They found that individuals who follow fewer users tend to use a larger number of unique hashtags to refer to events, indicating greater malleability and variability in hashtag use. Rathnayake [33] sought to conceptualize networking events from a platform-oriented view of media events, emphasizing the role of hashtags in bottom-up construction. Hashtags played a key role in this taxonomy, reflecting their significance in organizing and categorizing discussions around specific events. Furthermore, Bodaghi et al. [34] demonstrated that the size of a user's friend network also impacts broader aspects, such as their decision to participate in an information-spreading process. The characteristics and dynamics of an individual’s social network play a role in shaping their behavior and engagement with hashtags.
These studies collectively contribute to our understanding of hashtag adoption and its relationship to social networks, providing insights into the factors that influence individuals’ decisions to adopt and use hashtags in online platforms like Twitter.

Method and analysis

Data collection

For this study, a random sample of 86,334 users from the United States was selected. The data collection process involved crawling their tweets, specifically the last 3200 tweets if available, until October 2020. The crawling process continued for these users at seven additional time intervals until February 2021. This resulted in a total of eight waves of data, encompassing all the tweets from these 86,334 users starting from the 3200th tweet prior to the first crawling time in October 2020, up until their last tweet on February 2, 2021. The eight waves of crawled data were then merged into a final dataset, and any overlapping tweets were removed. The final dataset consists of a data frame containing 304,602,173 unique tweets from the 86,334 users. Each tweet in the dataset is associated with 23 features, resulting in a dataset size exceeding 31 GB. Additionally, another dataset was created by crawling the user characteristics of these 86,334 users, such as the number of followers, friends, and statuses. The dataset includes four types of tweets: Retweet, Quote, Reply, and Original. Each tweet in the dataset belongs to only one of these types (pure mode) or a combination of types (hybrid mode). The hybrid modes are represented in two forms: (1) a retweet of a quote and (2) a reply that contains a quote. To maintain consistency and focus on pure modes in the dataset, the former was considered solely as a retweet, and the latter was treated as a quote only. As a result, the approximate counts of the four tweet types (Retweet, Quote, Reply, and Original) in the dataset are 143 M, 23 M, 77 M, and 61 M, respectively. To ensure a more recent focus on activities, the analysis specifically considered data from October 9, 2019, onwards. This date, October 9, 2019, was chosen as it is one year prior to Twitter’s issuance of the quoting encouragement policy. 
By using this cutoff date, the analysis concentrates on the data relevant to the policy's implementation and subsequent effects.
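The wave-merging and tweet-type resolution described above can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the boolean flag columns (`is_retweet`, `is_quote`, `is_reply`) are hypothetical stand-ins for the corresponding Twitter API fields, and the two toy "waves" stand in for the eight crawls.

```python
import pandas as pd

def classify_tweet(row):
    """Resolve a tweet to one of the four pure types.

    Hybrid modes are collapsed as described above: a retweet of a
    quote counts only as a retweet, and a reply containing a quote
    counts only as a quote.
    """
    if row["is_retweet"]:
        return "Retweet"          # includes retweets of quotes
    if row["is_quote"]:
        return "Quote"            # includes replies that quote
    if row["is_reply"]:
        return "Reply"
    return "Original"

# Two toy crawl waves with one overlapping tweet (id 2).
waves = [
    pd.DataFrame({"tweet_id": [1, 2], "is_retweet": [True, False],
                  "is_quote": [False, True], "is_reply": [False, True]}),
    pd.DataFrame({"tweet_id": [2, 3], "is_retweet": [False, False],
                  "is_quote": [True, False], "is_reply": [True, False]}),
]

# Merge the waves and drop overlapping tweets by tweet ID,
# then assign each remaining tweet a single pure type.
merged = pd.concat(waves).drop_duplicates(subset="tweet_id")
merged["type"] = merged.apply(classify_tweet, axis=1)
```

Applied to the full dataset, the same deduplicate-then-classify step yields the 304,602,173 unique tweets and the four pure-type counts reported above.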

Data exploration

This section explores three aspects of the data: (1) the average number of tweets per user in each tweet type, (2) the number of active users in each tweet type, and (3) the usage of hashtags. The analysis includes all 86,334 users in the dataset. The exploration is conducted across three intervals: (1) pre-interval (from October 9, 2019, to October 8, 2020), (2) within-interval (from October 9, 2020, to December 15, 2020), and (3) post-interval (from December 16, 2020, to February 2, 2021). The code used for these explorations is publicly available. Figure 2 presents the results for the first two aspects. The plots on the left-hand side illustrate the average number of tweets published in each tweet type, namely Original, Quote, Reply, and Retweet. The plots on the right-hand side display the number of active users in each tweet type. Active users in a specific type on a given day are defined as users who have published at least one tweet in that type on that day.
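Under the active-user definition above, the daily counts can be computed as a distinct-user count per (day, type) pair. A minimal sketch with toy data and hypothetical column names:

```python
import pandas as pd

# Toy tweet log: one row per tweet (column names are illustrative).
tweets = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "type":    ["Quote", "Quote", "Retweet", "Quote", "Reply"],
    "date":    pd.to_datetime(["2020-10-10", "2020-10-10",
                               "2020-10-10", "2020-10-11", "2020-10-11"]),
})

# A user is "active" in a type on a day if they posted at least one
# tweet of that type that day, so count distinct users per (date, type).
active = (tweets.groupby(["date", "type"])["user_id"]
                .nunique()
                .unstack(fill_value=0))
```

Note that user 1's two quotes on October 10 count as one active user; the average-tweets-per-user plots instead divide the raw tweet counts by the 86,334 panel users.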

Fig. 2 Daily rates of user activities during pre-, within-, and post-intervals

To analyze the usage of hashtags, the first step is to identify political hashtags. This involves extracting all the hashtags used in the dataset from September 1, 2020, to February 1, 2021, excluding the last day of the dataset (February 2, 2021) due to incomplete data collection. The following intervals are defined based on this period:

  • Pre-Interval: September 1, 2020, to October 8, 2020.

  • Within-Interval: October 9, 2020, to December 15, 2020.

  • Post-Interval: December 16, 2020, to February 1, 2021.

The extraction process yields a total of 1,126,587 hashtags. From this set, the 100 most frequently used hashtags are selected for further analysis. These selected hashtags are then reviewed and annotated by two referees, considering their political context. Through consensus between the referees, 32 hashtags out of the initial 100 are identified as political. The results of the usage analysis on these selected political hashtags are presented in Fig. 3.
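The extraction and ranking step can be sketched as follows, assuming a simplified hashtag pattern (the actual crawler works from the API's hashtag entities rather than a regex, and the toy tweets below are invented):

```python
import re
from collections import Counter

def extract_hashtags(text):
    """Pull hashtags out of a tweet's text (simplified pattern),
    lower-cased so #Vote and #vote are counted together."""
    return [tag.lower() for tag in re.findall(r"#(\w+)", text)]

tweets = [
    "Go vote! #Election2020 #Vote",
    "Watching the debate #Election2020",
    "#vote early if you can",
]

# Count every occurrence across the corpus, then keep the most
# frequent hashtags for manual political/non-political annotation.
counts = Counter(tag for t in tweets for tag in extract_hashtags(t))
top = [tag for tag, _ in counts.most_common(100)]   # top-100 in the paper
```

In the study, the resulting top-100 list is what the two referees annotated, yielding the 32 political hashtags analyzed in Fig. 3.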

Fig. 3 Usage of political hashtags. The left plot presents a word cloud depicting the 32 most frequently repeated political hashtags. The right plot displays the distribution of these political hashtags. The upper plot labels the significant dates associated with spikes in usage

Table 1 displays the key dates corresponding to the significant spikes observed in the plots depicted in Fig. 3. These events directly influenced the patterns observed in the dataset.

Table 1 Significant dates of spikes in dataset plots

Measurements of quantitative effects

To perform quantitative analysis, the data frame of each user was extracted by segregating all tweets associated with the same user ID. This process resulted in the creation of 86,334 individual data frames, each corresponding to a unique user. Subsequently, each user's data frame was divided into three distinct time intervals as follows:

  • Pre Interval [2019-10-09 to 2020-10-08]: This interval encompasses the year prior to the implementation of the new Twitter policy on 2020-10-09. Hence, the end of this interval is set as 2020-10-08.

  • Within Interval [2020-10-09 to 2020-12-15]: This interval spans from the policy’s inception on the first day, i.e., 2020-10-09, until its termination by Twitter on the last day, i.e., 2020-12-15.

  • Post Interval [2020-12-16 to 2021-02-02]: This interval commences on the day immediately following the removal of the policy, i.e., 2020-12-16, and continues until the last day on which a user published a tweet within the dataset. The dataset's coverage concludes on 2021-02-02, which represents the latest possible end date for this interval if a user had any tweet published on that date.
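The per-user segmentation above amounts to filtering each user's data frame by date range. A minimal sketch (toy data frame; in the study a user's post-interval ends at their last tweet, with 2021-02-02 as the latest possible end date):

```python
import pandas as pd

# Interval boundaries as defined above (inclusive on both ends).
PRE    = (pd.Timestamp("2019-10-09"), pd.Timestamp("2020-10-08"))
WITHIN = (pd.Timestamp("2020-10-09"), pd.Timestamp("2020-12-15"))
POST   = (pd.Timestamp("2020-12-16"), pd.Timestamp("2021-02-02"))

def split_intervals(user_df):
    """Split one user's tweet data frame into the three study intervals."""
    out = {}
    for name, (start, end) in {"pre": PRE, "within": WITHIN, "post": POST}.items():
        mask = (user_df["date"] >= start) & (user_df["date"] <= end)
        out[name] = user_df[mask]
    return out

# One toy user with a tweet in each interval.
user_df = pd.DataFrame({
    "date": pd.to_datetime(["2020-05-01", "2020-11-01", "2021-01-10"]),
    "type": ["Retweet", "Quote", "Retweet"],
})
parts = split_intervals(user_df)
```

Repeating this over the 86,334 per-user data frames produces the three sub-series that the slope and level models are fitted to.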

Impact analysis of the Twitter policy

The objective of this analysis is to assess the individual impact of the new Twitter policy, which promotes quoting instead of retweeting, on each user. Specifically, we aim to examine how the rate and quantity of published tweets per day have been altered following the implementation or removal of the new policy. Figure 4 illustrates the slopes and levels of a selected tweet type (quote) within each interval for a given user. Given the presence of four tweet types and three intervals, it is necessary to fit a total of 12 models for each user, corresponding to each tweet type within each interval.

Fig. 4 Slope and levels of the number of quotes published by users during the three intervals (pre-, within-, and post-intervals). The green lines depict the linear regression of the time series for each interval. The slopes of the pre-, within-, and post-intervals correspond to the slopes of the AB, CD, and EF lines, respectively. The start/end levels of the pre-, within-, and post-intervals are represented by A/B, C/D, and E/F, respectively

Model

To analyze the impact of the new policy for each tweet type within a specific interval, we applied linear regression using the Ordinary Least Squares method (Eq. 1) in Python for users who had at least 7 data points with non-zero values.

$$y=\beta + \alpha x + \varepsilon$$
(1)

where y is the number of tweets per day, x is the number of days, \(\alpha\) is the coefficient representing the slope, \(\varepsilon\) is the error, and \(\beta\) is the level. We then checked for the presence of autocorrelation in the residuals using the Durbin–Watson test (Eq. 2). If no autocorrelation was detected, we used linear regression to calculate the slopes and levels.

$$d=\frac{{\sum }_{i=2}^{n}{({e}_{i}-{e}_{i-1})}^{2}}{{\sum }_{i=1}^{n}{e}_{i}^{2}}$$
(2)

where d is the Durbin–Watson statistic, \({{\text{e}}}_{{\text{i}}}\) is the residual at observation i, n is the number of observations. The Durbin–Watson statistic ranges from 0 to 4. A value around 2 indicates no autocorrelation, while values significantly less than 2 suggest positive autocorrelation, and values significantly greater than 2 suggest negative autocorrelation. However, if autocorrelation was present, we employed linear regression with autoregressive errors (Eq. 3).

$$y_i=\alpha {x}_{i}+ {\delta }_{i}$$
(3)

where \({\delta }_{i} = {\widehat{\phi }}_{1}{\delta }_{i-1}+ {\widehat{\phi }}_{2}{\delta }_{i-2}+\dots + {\widehat{\phi }}_{p}{\delta }_{i-p}-{\widehat{\theta }}_{1}{e}_{i-1}- {\widehat{\theta }}_{2}{e}_{i-2}-\dots -{\widehat{\theta }}_{q}{e}_{i-q}+{\varepsilon }_{i}\)

In this equation, the errors are modelled using an ARIMA(p, d, q) process, where p and q represent the lags of the autoregressive (AR) and moving-average (MA) components, respectively, and d is the differencing order. We utilized the SARIMAX(p, d, q) model in Python to implement this regression, where the exogenous variable x (as in Eq. 1) represents the number of days.

To determine the best values for the model's parameters, we conducted a grid search over a set of candidate parameter combinations. We then evaluated the results for each combination against the following criteria: (1) all estimated coefficients must be significant, (2) the Akaike Information Criterion (AIC), computed per Eq. (4), should be less than 5000, and (3) the Ljung–Box test p-value, computed per Eq. (5), should exceed 0.05, indicating no significant residual autocorrelation.
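The selection loop can be sketched as follows. The search ranges and the summary fields attached to each fitted model are illustrative assumptions, not the exact values used in the study:

```python
import itertools

# Illustrative search ranges for the ARIMA error orders (p, d, q).
p_range, d_range, q_range = range(3), range(2), range(3)
candidates = list(itertools.product(p_range, d_range, q_range))

def select_best(fits):
    """fits: one dict per candidate holding hypothetical summaries of a fitted
    SARIMAX model: 'all_coefs_sig', 'aic', 'ljung_box_p', and 'sigma2'."""
    admissible = [f for f in fits
                  if f["all_coefs_sig"]           # criterion 1: significant coefficients
                  and f["aic"] < 5000             # criterion 2: AIC bound (Eq. 4)
                  and f["ljung_box_p"] > 0.05]    # criterion 3: Ljung-Box p-value (Eq. 5)
    # Among admissible fits, prefer the lowest residual variance (Eq. 6).
    return min(admissible, key=lambda f: f["sigma2"]) if admissible else None
```

In a real run, each candidate order would be passed to a SARIMAX fit and its summaries extracted from the fitted results; the filter-then-minimize structure is the point of the sketch.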

$$AIC = -2 \times \mathcal{L}+2\times k$$
(4)

where \(\mathcal{L}\) is the maximum log-likelihood of the model and k is the number of estimated parameters. A lower AIC value indicates a better choice of model orders.

$$Q(h)=n(n+2){\sum }_{k=1}^{h}\frac{{\rho }_{k}^{2}}{n-k}$$
(5)

where n is the sample size and \({\rho }_{k}\) is the sample autocorrelation at lag k. The test statistic follows a chi-squared distribution with degrees of freedom equal to the number of lags considered. The null hypothesis is that there is no autocorrelation up to the specified lag, so a p-value greater than 0.05 suggests no significant autocorrelation in the residuals, indicating an adequate fit. Finally, among the admissible results, the model with the lowest \({\sigma }^{2}\) (Eq. 6), i.e., the least residual variance, was chosen as the best-fit model.
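Eq. 5 is straightforward to compute from the residual autocorrelations; the helper below (our own sketch, not the study's code) makes the calculation explicit:

```python
def ljung_box_q(residuals, h):
    """Ljung-Box statistic Q(h) from Eq. 5 for a residual series."""
    n = len(residuals)
    mean = sum(residuals) / n
    denom = sum((e - mean) ** 2 for e in residuals)
    q = 0.0
    for k in range(1, h + 1):
        # Sample autocorrelation of the residuals at lag k.
        rho_k = sum((residuals[i] - mean) * (residuals[i - k] - mean)
                    for i in range(k, n)) / denom
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q
```

Q(h) is then compared against a chi-squared quantile with h degrees of freedom; large values reject the no-autocorrelation null. A strongly alternating residual series, for instance, produces a large Q(1), while near-white residuals keep it small.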

$${\sigma }^{2}=\frac{\sum_{i=1}^{n}{({e}_{i}-\overline{e })}^{2}}{n-1}$$
(6)

where \({e}_{i}\) is the residual at observation i, \(\overline{e }\) is the mean of the residuals, and n is the number of observations. In the case of time series analysis, the residuals are the differences between the observed values and the values predicted by the ARIMA or SARIMAX model. The parameter values corresponding to this model were considered the optimal fit. The entire process for obtaining the slope and level findings is depicted in Fig. 5, and the results are presented in Table 2.

Fig. 5 Flowchart illustrating the overall analysis procedure for slope and level assessments

Table 2 Changes in tweeting trends between intervals

Table 2 illustrates variations in the slope and level of tweeting between intervals. For instance, a level change of − 1.025 indicates a drop of approximately 1.025 in the number of quotes published per day from the pre-interval to the within-interval. Similarly, a slope change of 0.003 reflects an increase of around 0.003 quotes per day in the slope of quoting during the same transition. The table provides analogous slope and level changes for the other tweet types across the different intervals.

Analysis of qualitative effects

User characteristics

In this section, we aim to investigate the changes in user behavior towards the Twitter policy based on user characteristics such as the number of followers, number of friends, and number of statuses. To achieve this, we consider users whose obtained models are significant in both paired intervals (pre-within or within-post). We calculate the correlations between the values of these characteristics and the rate of change in the slope of each tweet type between the intervals. The results of this analysis are presented in Table 3.
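This correlation step reduces to the sample Pearson coefficient between a user characteristic and the per-user slope change. A minimal pure-Python version (illustrative; equivalent to, e.g., scipy.stats.pearsonr without the p-value) is:

```python
def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)
```

Applied per characteristic, xs would hold, say, each user's friend count and ys the corresponding rate of slope change for a given tweet type between two intervals.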

Table 3 Pearson correlations between user characteristics and slope change for each tweet type and each interval

For instance, based on Table 3, a notable negative correlation of − 0.042 is observed between the number of friends a user has and the rate of slope change for quote publishing from the pre-interval to the within-interval. Additionally, a significant negative correlation of − 0.079 is evident between the number of quotes published in the post-interval and the number of retweets previously published in the within-interval. Further detailed explanations and implications are presented in the “Results” section.

Text characteristics

The analysis of text characteristics focuses on examining the impact of the new policy on the length and sentiment of quote texts. Specifically, we are interested in understanding how the quote texts of two different user groups, namely “short-term quoters” and “long-term quoters,” have changed in terms of length and sentiment from the pre-interval to the within-interval. We define the two groups as follows:

  • Short-term Quoter: A user who did not engage in quoting during the pre-interval but started quoting in the within interval.

  • Long-term Quoter: A user who engaged in quoting during the pre-interval and continued to do so in the within-interval.

In both definitions, a quoter is a user whose average number of quotes among all tweets exceeds a given threshold.
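These definitions can be operationalized as a small classifier. The threshold semantics here (the user's overall fraction of quotes among all tweets across the two intervals) are our reading of the definition above, and the default of 0.05 matches one of the threshold levels reported in Table 4:

```python
def classify_quoter(pre_quotes, within_quotes, pre_total, within_total,
                    threshold=0.05):
    """Return 'short-term', 'long-term', or None per the definitions above."""
    total = pre_total + within_total
    if total == 0 or (pre_quotes + within_quotes) / total <= threshold:
        return None                      # not a quoter at this threshold
    if pre_quotes == 0 and within_quotes > 0:
        return "short-term"              # started quoting only under the policy
    if pre_quotes > 0 and within_quotes > 0:
        return "long-term"               # quoted in both intervals
    return None
```

For example, a user with 0 quotes out of 50 tweets pre-interval and 10 quotes out of 50 tweets within-interval would be labeled a short-term quoter at the 0.05 threshold.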

For the analysis, we extract three characteristics from the quote text: (1) the number of characters (excluding spaces), (2) the sentiment score, and (3) the number of times the quote has been retweeted. We preprocess the text by removing non-ASCII characters, emojis, mentions, and hashtags. To calculate the sentiment score, we utilize the sentiment analyzer from the Python NLTK package, which is based on VADER, a lexicon- and rule-based sentiment analysis tool specifically designed for sentiments expressed in social media. The sentiment score calculated by VADER is a compound score representing the overall sentiment of a text, computed from the valence (positivity or negativity) of the individual words in the text (Eq. 7).

$${S}_{compound}=\frac{\sum V\times I}{\sum I}$$
(7)

where \({S}_{compound}\) is the compound sentiment score, V is the valence score of a word, normalized to lie between − 1 (most negative) and 1 (most positive), and I is the intensity of the word, which serves as its weight. The rate of change in the average value of these characteristics from the pre-interval to the within-interval is then calculated for each user. Finally, we compute the average rates of change separately for short-term and long-term quoters, as presented in Table 4.
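The preprocessing and the weighted-average form of Eq. 7 can be sketched as follows. This is a simplified illustration: NLTK's actual VADER implementation applies additional heuristics (negation, punctuation, capitalization) and a different normalization for its final compound score, so the function below mirrors Eq. 7 as written rather than the full tool:

```python
import re

def preprocess(text):
    """Strip mentions, hashtags, emojis, and other non-ASCII characters."""
    text = re.sub(r'[@#]\w+', '', text)                 # mentions and hashtags
    text = text.encode('ascii', 'ignore').decode()      # emojis / non-ASCII
    return re.sub(r'\s+', ' ', text).strip()            # collapse whitespace

def compound_score(valences, intensities):
    """Intensity-weighted mean of word valences, per Eq. 7 (valences in [-1, 1])."""
    return sum(v * i for v, i in zip(valences, intensities)) / sum(intensities)
```

A positive word of valence 1.0 and weight 1 combined with a mildly negative word of valence − 0.5 and weight 2, for instance, yields a neutral compound score.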

Table 4 Comparative analysis of text characteristics between short-term and long-term quoters

As illustrated in Table 4, 34,317 users in the dataset exhibited a quote-publishing rate exceeding 0.05 during the pre- and within-intervals, i.e., more than 5 quotes per 100 published tweets. These users showed a marginal increase (0.006) in the average sentiment of their quotes from the pre-interval to the within-interval. Conversely, the 5900 users who had no quotes in the pre-interval but whose quotes exceeded 0.01 of all their tweets during the within-interval experienced a decrease of 0.242 per day in their rate of retweeting from the pre-interval to the within-interval. Further detailed explanations and implications are presented in the “Results” section.

Results

Impact of the Twitter policy

The findings of the impact analysis are presented in Table 2, illustrating the changes in slopes for different tweet types. It is observed that the slope of each tweet type, except for quotes, decreased upon entering the within interval, while quotes experienced a slight increase (0.0031). Notably, prior to the implementation of the new policy, there was a substantial increase in the number of daily tweets across all types. Therefore, the decline in levels during the within intervals relative to the pre-intervals can be attributed to this initial surge in activity. Another significant result is the considerable decrease in the number of daily published quotes during the post-interval compared to the within-interval. Additionally, a significant decrease (− 2.785) and increase (11.587) are observed in the slope of retweets per day during the within and post intervals, respectively. These notable changes in both retweet and quote rates highlight the impact of Twitter's new policy. When examining these results from a broader perspective, two trends emerge: (1) during the transition from the pre-interval to the within interval, the slope of all tweet types, except for quotes, decreased, and (2) from the within-interval to the post-interval, the slope of all tweet types increased, except for quotes. These trends underscore the pronounced impact of the new policy implemented by Twitter. In conclusion, it can be inferred that the policy has achieved some progress. However, determining the true success of the policy requires considering Twitter's overarching goals within a broader context, encompassing both short-term and long-term consequences.

User characteristics

The correlations between user characteristics and slope changes in each tweet type during different intervals are presented in Table 3. The results, particularly the correlations between slope changes in quoting and retweeting and other user characteristics, can be examined from three perspectives: the pre-within transition, within-post transition, and a comparison of pre-within to post-within.

Pre-within transitions

Regarding the pre-within transition, several noteworthy relationships can be observed. Firstly, there is an inverse relationship between the number of friends a user has and the slope change for the quote type. This suggests that users with a larger number of friends exhibit less improvement in their quoting rate during the within-interval (following the implementation of the Twitter policy). Similarly, the number of statuses published by users also demonstrates a negative correlation with the slope change for quotes. In other words, users who tend to publish a higher number of statuses show less inclination to increase their quoting rate during the within interval. Additionally, significant relationships emerge between the slope change in quoting during the pre-within interval and both retweet counts and the number of data points. This indicates that users who have engaged in more retweets are more likely to exhibit a propensity for quoting during the within-interval. Similar relationships can be observed between the slope change in quoting during the pre-within interval and other tweet types, suggesting that more active users are more influenced by the changes in the quoting behavior.

Within-post transitions

Analyzing the within-post transitions, several significant relationships can be observed. Firstly, the slope change in retweeting during the within-post interval exhibits a significant relationship with the number of quotes and original tweets during the within interval. This implies that users who have a higher number of quotes and original tweets in their activity would experience a greater increase in the retweeting rate after the policy cancellation (post-interval). However, the slope change in retweeting during the within-post interval does not show a significant relationship with the slope change in any other tweet type, except for an inverse relationship with original tweets. In other words, users who engage more in original tweets during the within-interval are likely to exhibit a lower increase in the rate of retweeting during the post-interval. Regarding the slope change in quoting during the within-post interval, a significant negative relationship is observed with the number of retweets during the within interval. This indicates that users who have a higher number of retweets during the within-interval are likely to experience a lower increase in the quoting rate during the post-interval. This relationship holds true for users who have quoted more during the within interval as well.

Pre-within to within-post comparison

Comparing the slope change in quoting and retweeting between the pre-within and within-post transitions, it can be observed that users who experienced an increase in their quoting or retweeting rate during the pre-within transition tend to exhibit a higher inclination to decrease it during the within-post transition. Additionally, a significant inverse relationship is evident between the slope change in quoting during the pre-within interval and the slope change in retweeting during the within-post interval. This implies that users who witnessed a greater increase in their quoting rate during the pre-within transition are likely to experience a larger decrease in their retweeting rate during the within-post transition.

Text characteristics

The results of the text analysis, specifically length, sentiment, and the number of retweets, are presented in Table 4. Examining the results reveals several key findings. Firstly, the quote texts of long-term quoters have undergone a reduction in length during the within interval compared to the pre-interval, across all threshold levels. However, for short-term quoters, this reduction in quote length only occurs at threshold levels equal to or above 0.05. Furthermore, among those whose quote texts have been shortened (at threshold levels of 0.05 and 0.075), short-term quoters experience a greater reduction in length compared to long-term quoters. Regarding sentiment analysis, the results indicate an overall increase in the sentiment score of quote texts from the pre-interval to the within interval. However, this increase is more pronounced for short-term quoters compared to long-term quoters.

Additionally, for both categories and across all threshold levels, the number of retweets received by quotes has decreased from the pre-interval to the within-interval. This decrease is particularly significant for long-term quoters, except at threshold level 0 for short-term quoters. This observation aligns with expectations since short-term quoters did not have any quotes during the pre-interval, resulting in their sentiment score being subtracted from the sentiment scores of quotes during the within interval. Notably, the decrease in the number of retweets is more substantial for long-term quoters, except at a threshold level of 0.075, where it is slightly higher for short-term quoters. In summary, by considering a threshold of 0.075 as an indicator, we can conclude that the Twitter policy has influenced quote texts in the following ways: (1) There is a greater reduction in the number of characters for short-term quoters compared to long-term quoters, and (2) The increase in sentiment score is more significant for short-term quoters relative to long-term quoters.

Discussion

The findings pertaining to the hypotheses are outlined in Table 5.

Table 5 The hypotheses and results

Quantitative findings

The quantitative analysis, based on hypotheses H1–4, reveals that the intervention has a negative impact on users’ retweeting behavior, while other tweet types remain relatively unaffected. However, the cessation of the intervention leads to an increase in the retweeting rate and a decrease in the quoting rate. When considering only the period when the policy was in effect, namely the within-interval, it can be concluded that the policy was partially successful. Despite a minor increase in the quoting rate, the significant decline in retweeting indicates a positive outcome. However, when examining the long-term effects after discontinuation of the policy, i.e., the post-interval, the policy can be regarded as a failure, as the retweeting rate experienced a dramatic increase while the quoting rate decreased substantially. Although Twitter neither forced users to quote instead of retweeting nor provided any explicit promotion or reward for quoting, the quoting encouragement policy may have influenced users' perceptions and served as a virtual external incentive for initiating quoting behavior. This phenomenon can be explained by the adaptive nature of the brain in perceiving rewards based on recent levels and ranges of rewards, fictive outcomes, social comparisons, and other relevant factors [35, 36]. The motivation crowding theory offers a framework for discussing this observation. When an extrinsic reward is removed, the level of intrinsic motivation diminishes compared to a scenario where no additional reward was initially provided [37]. In the case of Twitter's policy, users may have perceived the extrinsic incentive of adding a few extra characters to a retweet as rewarding and complied accordingly. However, once this external incentive was eliminated, the residual intrinsic motivation decreased below its initial level. This explains the subsequent decline in the quoting rate during the post-interval, accompanied by a surge in retweeting activity.

Qualitative findings

The qualitative analysis, focusing on hypotheses H5–11, reveals several noteworthy patterns. Users with a smaller number of friends and higher levels of overall tweet activity are more inclined to align with the policy and increase their quoting rate during the within interval. Furthermore, users who experienced an increase in their quoting rate during the within interval are more likely to decrease their quoting rate following the policy withdrawal in the post interval. Additionally, users who adopted quoting behavior as a result of the policy during the within interval demonstrated a tendency to publish quotes with shorter text length and more positive emotions. Two observed patterns can be explained respectively by the TPB and the DOI. The TPB posits that an individual’s behavioral intentions are influenced by three components, with subjective norms playing a significant role [38]. The impact of subjective norms is contingent upon the connections an individual has with others. Users with a smaller number of friends have fewer channels through which subjective norms can exert pressure. Consequently, these users are less influenced by societal norms that have not yet accommodated the new policy. Hence, users with fewer friends are more likely to be early adopters of the policy. Moreover, recent research [39] suggests that TPB, along with the theory of the Spiral of Silence, can potentially explain the avoidance of adoption, particularly when adoption involves expressing individual beliefs. Furthermore, the DOI provides insights into the adoption process, suggesting that adopters can be categorized into distinct groups based on the timing of their adoption [40]. Through this categorization, shared characteristics in terms of personality, socioeconomic status, and communication behaviors emerge. 
Early adopters, characterized by a greater tolerance for uncertainty and change, often exhibit higher levels of upward mobility within their social and socioeconomic contexts, as well as enhanced self-efficacy [41]. These characteristics are reflected in the more positive emotions expressed in their quote posts.

Implications

This study carries implications from both practical and theoretical perspectives. From a practical standpoint, the findings provide valuable guidance for practitioners in developing a multistage model that captures users’ behavior towards a new social media policy at an aggregate level. Such a model is crucial for designing efficient strategies aimed at expediting the adoption process among the majority of users. Leveraging the quantitative analysis method employed in this study, practitioners can first evaluate the impact of the policy, and then, using the qualitative analysis method, identify users who are more inclined to adopt or reject the policy based on their characteristics and text behavior. Gaining insights into user tendencies towards policy adoption or rejection in advance can inform a series of initiatives, including targeted user categorization to introduce or withhold the policy during its initial stages. An illustrative study by Xu et al. [42] explored public opinion on Twitter during Hurricane Irma across different stages, analyzing over 3.5 million tweets related to the disaster to discern distinct thematic patterns that emerged during each stage. Their findings assist practitioners in utilizing Twitter data to devise more effective strategies for crisis management. From a theoretical perspective, the findings contribute to the advancement of theories such as the TPB and the DOI in the realm of cyberspace. According to TPB, subjective norms play a significant role in shaping human behavior. This study revealed that users with a smaller number of friends are more inclined to accept the new policy. This suggests that users who have fewer connections are more likely to deviate from the prevailing norm in which the adoption of the new policy has not yet gained traction. 
Furthermore, the higher rates of positivity observed in the quote texts of short-term quoters, relative to their long-term counterparts, contribute to the extension of the Innovation Diffusion Theory regarding policy adoption and expand our understanding of the possible manifestations of early adopters' characteristics in the context of social media.

For a more nuanced understanding, it is noteworthy to explore the impact of events on user behavior. While events like debates can undeniably influence user activity levels, this impact is likely experienced across all types of users, such as quoters and retweeters. Our analysis, examining individual users across multiple time intervals that encompass these events, allows us to observe user-specific behavioral evolution. The extracted patterns thus represent dominant shifts in spreading behavior observed in the majority, irrespective of their original preference (retweeting or quoting). This observed consistency suggests that the policy's influence may extend beyond just event-driven fluctuations. The consistent shift in information-sharing behavior throughout the study period points towards the possible contribution of additional factors beyond isolated events.

Conclusion and future works

This research employed a big data approach to analyze the Twitter quoting encouragement policy, examining both its quantitative and qualitative effects. The research timeline was divided into three distinct intervals: pre, within, and post-intervals. Time series analysis was then utilized to identify changes in the rates of different tweet types across these intervals. Additionally, text and sentiment analysis, along with correlation methods, were applied to explore the relationships between user characteristics and their responses to the policy. The results revealed a short-term success followed by a long-term failure of the policy. Moreover, a set of user characteristics was identified, shedding light on their adherence to the policy and their quoting tendencies during the policy’s implementation. These findings have significant implications for the development and evaluation of new policies in the realm of social media, offering valuable insights for the design of more effective strategies.

The study of policy adoption on social media is still in its early stages, particularly in the realm of data analytics and behavioral research [43]. Future studies can build upon this research and explore additional factors and techniques to deepen our understanding. For example, the impact of aggregations, such as crowd emotional contagion, convergence behavior, and adaptive acceptance, can be modelled as exogenous factors in the analysis [44, 45]. Additionally, incorporating new techniques for sentiment analysis, as highlighted in studies by Zhao et al. [46], and Erkantarci et al. [47], as well as semantic techniques [48], can further enhance computational analyses. Moreover, future research can consider factors related to the continuance of use [49] to examine the reasons behind policy rejection by users who initially adopted it. The inclusion of census data, search logs of users [50], user demographics [51], and the analysis of interconnections within a graph [52] would be valuable additions to the analysis. These additional data sources can provide a more comprehensive understanding of user behaviors and interactions. Furthermore, it is important to consider bot filtering techniques to ensure the accuracy and reliability of the findings. This step is particularly crucial for extending the research beyond Twitter and examining policy adoption in non-cyber spaces. By exploring these avenues of research, future studies can advance our knowledge of policy adoption on social media, providing valuable insights into user behaviors, motivations, and the effectiveness of policy interventions. Finally, this study’s data collection and storage methods share similarities with those employed in prior efforts [53]. However, there remains significant potential for innovation in this area.