Introduction

For executive bodies, social media is a primary means of communication and engagement with the general public. In the light of increasing politicization of European integration [1,2,3], social media could provide an important communication channel for European institutions, as their wide outreach in theory allows them to actively shape public perception and influence debates on European integration [4,5,6].

Supranational executive bodies like the European Commission (EC) can especially benefit from being able to directly reach out to the wider European citizenry through social media. The European Commission has always occupied a unique institutional space, often referred to as a missionary bureaucracy [7], tasked formally with a political role as guardian of the European Union treaties, as well as the traditional technocratic functions associated with its executive status. Social media provide strategic channels to increase everyday communication with citizens. In the absence of other direct accountability mechanisms, these channels can provide an opportunity for institutions like the EC to legitimize their activities in the public eye. They also allow the EC to contribute to, and potentially shape debates about the role that the EU plays in the lives of its citizens and perhaps even to influence emerging conceptions of European identity. Yet, a detailed analysis of how the EC has used the strategic potential offered by social media, and how its communication has evolved, is currently lacking.

Fig. 1

Number of tweets per day produced by the EC’s Twitter accounts between July 2010 and July 2022. The light blue line represents the raw count of tweets for a given day. The darker line represents temporally smoothed estimates. Smoothing is performed by computing a rolling average over 7-day windows. The red horizontal line marks the average number of tweets per day over the full time span

Recent research has provided an important and relevant bird’s eye perspective on the social media communication of EU supranational actors [5]. In this paper, we build on this work and extend it, zooming in on the temporal evolution of topics and style of social media communication of the European Commission as the EU’s executive institution. We also extend Özdemir and Rauh’s [5] work by showing how the thematic focus of the European Commission’s communication has evolved over time.

Leveraging advances in public availability of social media data (e.g., access to historical data through the v2 Academic Twitter API) and state-of-the-art text modeling (e.g., [8,9,10]), we conduct a comprehensive analysis of the content and stylistic features of the European Commission’s communication on Twitter (@EU_Commission) between June 2010, when the official account was created, and July 2022. By analyzing how topics and stylistic traits of EC Twitter communication have evolved, and benchmarking the EC against a range of other institutions, we aim to understand how the EC has reshaped its projected institutional identity over the past decade, and whether and how its communication has adapted to better resonate with the European public.

Research questions

Our study is structured as follows. First, we extracted daily tweet volumes between 2010 and 2022, and compared the EC’s tweet volume with that of a range of reference agencies, including: (a) national governments with English as official language (UK and Scotland); (b) other legislative and executive EU bodies (European Parliament and Council of the European Union); (c) transnational economic and monetary institutions (European Central Bank and International Monetary Fund); (d) other transnational intergovernmental organizations (the United Nations and the Organisation for Economic Co-operation and Development, OECD).

Secondly, we extracted prominent topics in the EC’s Twitter communication and analyzed how the volume of tweets for each has changed over the years. Thirdly, we analyzed how the style of the EC’s tweets has evolved, especially focusing on style features that are associated with increased accessibility and engagement [11, 12], and compared the EC’s style with that of other reference agencies. Finally, we used predictive models to systematically analyze how topics and stylistic traits relate to engagement, and gain additional insights into which features of the EC’s Twitter communication resonate most with its public.

Through these analyses, our study addresses the following research questions:

  1.

    To what extent has the European Commission used Twitter as a communication channel over the years? How does the number of tweets produced by the EC compare to that of other supranational and national agencies?

  2.

    What are the main themes of the EC’s Twitter communication? How have they evolved over time? Does this evolution reflect strategic attempts to generate higher engagement and support from its public?

  3.

    How has the linguistic style of the EC’s Twitter communication evolved over time? Is there evidence of it evolving towards more engaging and accessible messaging, a prerequisite for successful communication and increased self-legitimation? How does the style of EC communication on Twitter compare to that of other agencies? More specifically, is it less accessible than that of other agencies (as found for more traditional communication channels)?

  4.

    Which topics and features of style relate positively to engagement from the EC’s Twitter audience? Do expected relations between individual features of linguistic style and engagement hold for the EC’s public? Do predictive models lend evidence to the hypothesis that dynamic changes in the topics and style of EC communication may have been beneficial to public engagement?

Theoretical significance and related work

By addressing these research questions, this paper contributes to our understanding of supranational institutions’ digital communication strategies in general, and the European Commission’s communication on Twitter in particular. Beyond the work of Özdemir and Rauh [5], it speaks to a broader literature on the digital communication of government agencies and, in particular, of supranational institutions. Supranational institutions’ public communications matter because they present an opportunity for institutions to craft narratives about who they are and what their purpose is [13].

This is particularly important for the EC in times of heightened politicization, which refers to the idea that European integration has become the subject of “increasingly salient and polarised public debate among an expanding range of actors” (p.10) [1]. In this context, digital communication presents a tool for self-legitimation, whereby the EC tries to communicate to the public that it has a legitimate right to authority [5]. Research on public diplomacy has highlighted its increasing “digitalization”, which describes the long-term process through which digital technologies are influencing the “norms, values, working routines and structures of diplomatic institutions, as well as the self-narratives and metaphors diplomats employ to conceptualize their craft” [14]. Shifting communication modes and platforms therefore have implications both for the institutions’ audiences and subjects, and for the institutions themselves (see also [15]).

Research on digital diplomacy has investigated how the European Union uses digital tools to enhance its reputation and legitimate its policy objectives [16]. Conducting interviews with European External Action Service (EEAS) officials, Hedling (2020) finds that storytelling is an important part of EU public diplomacy, and a response to an increased need to create greater buy-in for EU policies among domestic audiences. A key objective of digital storytelling is audience engagement, rather than just passive information dissemination. This highlights the strategic importance of audience engagement as a tool and a goal for supranational actors like the EU.

Building on this research, one of the key research questions that this paper addresses is to what extent the European Commission’s Twitter communication has evolved towards a more engaging linguistic style. In another recent application of the strategic narratives framework to the EU’s online communication using quantitative text analysis methods, Moral [17] examines how the EU used different narratives to manage its reputation during the 2020 outbreak of the COVID-19 pandemic. Similar to this work, we use topic modelling to understand which key themes emerge and evolve in the EU’s online communication over time.

Finally, research on the EU’s digital communication has also pointed out instances where key narratives, such as on gender equality, are less prominently featured in the EU’s discourse than expected [18]. Similarly, with our over-time analysis of key themes in the European Commission’s Twitter communication, we contribute to a broad understanding of which theoretically foundational themes (such as the ones relating to identity) are absent from the EC’s discourse until more recent years.

Data sources and rationale

Our analyses are conducted on the full volume of tweets in English produced by the European Commission’s English Twitter account (@EU_Commission) and by a number of reference agencies. We selected two national executives (UK government, @10DowningStreet; Scottish government, @scotgov, selected because English-speaking), other legislative and executive EU bodies (European Parliament, @Europarl_EN; Council of the European Union, @EUCouncil), transnational economic and monetary institutions (European Central Bank, @ecb; International Monetary Fund, @IMFNews), and other transnational intergovernmental institutions (the United Nations, @UN, and the Organisation for Economic Co-operation and Development, @OECD). Our final database includes the full volume of tweets produced by the European Commission and each of these reference entities up to the beginning of the present study in July 2022.

Note that the decision to include only the European Commission’s main official account, rather than including, for example, tweets from individual members of the European Commission or Directorates-General (DGs), is motivated by a number of considerations. First, we were interested in capturing discourse from the European Commission in a broad and comprehensive fashion, rather than identifying granular trends in specific policy areas. Secondly, contrary to accounts of individual representatives, the European Commission’s main account provides uninterrupted data for the entire time span. Thirdly, focusing on the EU_Commission account ensures that views expressed in the tweets reflect views of the European Commission, rather than views of individual representatives. We acknowledge that our results might partly be a reflection of these choices, and that the patterns observed in our analyses may also be influenced by evolving social media management strategies related to the European Commission’s main official account. Analyses focusing on tweets from a more distributed set of sources might yield results that could fruitfully complement our work.

As to the choice of reference agencies, the rationale for selection was the following. We selected a range of agencies which would: (a) provide a suitable volume of tweets in English; (b) be comparable to the EC in different respects. The two national executives were selected because they are two English-speaking governments located in the same geographical area (note that the US government does not have). The UN and the OECD were selected because they are international organizations with comparable breadth in policy areas. Conversely, the ECB and the IMF were selected as a sample of institutions with a technocratic focus (the former being a European institution too). While this is far from a representative sample of all comparable institutions, it provides a rich and composite benchmark against which EC communication can be compared and interpreted.

Data were accessed through the v2 Academic Twitter API (now discontinued), and the text of tweets was preprocessed following a standardized pipeline (see section “Further methodological details”). Tweet metadata such as time of creation and engagement metrics (likes, retweets, quotes, replies) were also downloaded. Engagement data were condensed into a single engagement metric (which we will henceforth refer to as “engagement”) by summing the number of likes, retweets, quotes, and replies. For all analyses including engagement metrics, we use raw engagement data, as the Twitter API does not expose the number of visualizations (which is needed to compute engagement rates) nor historical data on other potential normalization factors such as follower counts.
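Condensing the four interaction counts into a single engagement score is a one-line aggregation. A minimal sketch, assuming the field names of the v2 Twitter API’s `public_metrics` object:

```python
def engagement(public_metrics: dict) -> int:
    """Sum likes, retweets, quotes, and replies into one engagement count.

    Field names follow the v2 Twitter API's `public_metrics` object;
    missing fields default to zero.
    """
    return (public_metrics.get("like_count", 0)
            + public_metrics.get("retweet_count", 0)
            + public_metrics.get("quote_count", 0)
            + public_metrics.get("reply_count", 0))
```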

Methodology and results

Twitter use by the EC

To tackle our first research question on how much the EC has used Twitter as a communication channel compared to the other reference agencies, we computed the raw volume of tweets produced by each of these agencies since the time their official accounts were created. Table 1 displays the overall number of tweets and the number of tweets per day for each of the target entities (EU_Commission highlighted in bold). As the table shows, the EC ranks second among our selected entities, with 6.75 tweets per day, suggesting that the EC has made conspicuous use of this communication channel over the years.

Table 1 Overview of the tweet volume for each institution

Figure 1 displays the daily number of tweets produced by the EC between 2010 and 2022. After peaking in 2013 and 2014, the number of daily tweets (and its variability) became fairly stable, revolving around the overall mean. More detailed visualizations of tweet volumes for reference agencies are displayed in the Appendix.
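The smoothing used in Fig. 1 (a 7-day rolling average over daily counts) can be sketched in plain Python. The paper does not state whether windows are centered or trailing, so this illustration assumes trailing windows:

```python
def rolling_mean(values, window=7):
    """Trailing rolling average: each point is the mean of the current
    value and the preceding `window - 1` values (shorter at the start)."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - window + 1)
        chunk = values[lo:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out
```

In practice the same result is obtained with `pandas.Series.rolling(7).mean()`.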

Main themes in the EC’s communication

Topic model

To answer our second research question on how the main themes in EC communication on Twitter have evolved, we used contextualized topic models [8, 19] to extract a set of interpretable topics from the full corpus of EC tweets. Topic models are a data-driven computational method that can be used to extract semantic themes commonly occurring in a large set of texts based on word co-occurrence statistics. A topic is, in fact, simply a set of words that tend to co-occur in the same texts, defining “semantic attractors” in the set of texts under analysis. Compared to traditional topic models, contextualized topic models leverage pretrained transformer models [20] to improve coherence and interpretability. To estimate the topic models, we implemented a robust model selection procedure and selected a pretrained DistilBERT model with 20 target topics and a 500-word vocabulary (see section “Further methodological details” for more information).

Our topic model identified the following core topics (displayed in Table 2), which we labeled by inspecting the top 10 topic-defining words and the 10 most representative tweets for each topic (see Table 17 in the Appendix), and clustered into seven macro-categories:

  • Economic and financial policy, including: Economy and Markets; Finance and Trade; Growth and Global Development; Strategic Investments (e.g., recovery, research, and innovation);

  • Social policy, including Health; Citizens’ rights and integration; Human rights;

  • Environmental and digital policy, including: Digital policy; Digital and green transition; Energy, Sustainability and Climate;

  • Identity and citizen participation, including: Identity, Culture, and Citizen Engagement; Visions for the Future; Citizen Initiatives;

  • Governance, including: Internal governance; Trade, partnerships, and law;

  • Solidarity and humanitarian aid, including: Solidarity and emergency response; Humanitarian aid;

  • Communications and media, including Press conferences and official statements; Charts, links, and infographics; Live events.

Table 2 List of topics and topic-defining words extracted through contextualized topic modeling

How have topics evolved?

We used the topic model to quantitatively describe the thematic content of each tweet in the dataset. For each tweet and topic, we extracted a score quantifying the extent to which a given topic is represented in the tweet. Based on these scores, we classified tweets as belonging to one of the 20 topics identified by the model, by simply extracting the topic whose score is highest. Aggregating the volume of tweets for each topic over time yielded a time series that makes it possible to understand how the focus of EC communication on Twitter has evolved over the entire time span.
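The assignment of tweets to their highest-scoring topic and the yearly aggregation described above can be sketched as follows (function and variable names are illustrative, not the authors’ code):

```python
from collections import Counter

def assign_topics(topic_scores):
    """Assign each tweet to its highest-scoring topic.

    `topic_scores` is a list of per-tweet score vectors (one entry per
    topic, e.g. 20-dimensional topic distributions); returns topic ids.
    """
    return [max(range(len(scores)), key=scores.__getitem__)
            for scores in topic_scores]

def topic_volume_by_year(years, topic_ids):
    """Count, for each year, how many tweets fall into each topic."""
    volumes = {}
    for year, topic in zip(years, topic_ids):
        volumes.setdefault(year, Counter())[topic] += 1
    return volumes
```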

Figure 2 (left panel) displays the resulting aggregated data. The heatmap shows that, while up until 2016 EC Twitter communication focused heavily on economy-related topics, governance, and sharing of information on institutional events and dynamics, 2017 marks a turning point. From 2017 on, tweets on social policy, environmental and digital policy, identity, and citizen engagement become prominent. Health becomes especially prominent during acute phases of the COVID-19 pandemic (Fig. 3). At a more fine-grained level, environmental policies display notably higher tweet volume from 2020 on, while digital policy, citizens’ rights and integration, and human rights (prominent in 2018 and 2020) have witnessed a slight decrease after 2020. The number of tweets related to solidarity and humanitarian aid increased dramatically in 2022, arguably a reflection of Russia’s invasion of Ukraine. The right panel of Fig. 2 zooms in on the volume of tweets per topic over the past 3 years. Health; Energy, Sustainability and Climate; Digital Policy; and Visions for the Future were the most prominent topics on average between 2020 and 2022, whereas Solidarity and Emergency Response, and Identity, Culture, and Citizen Engagement were the most prominent topics in 2022.

Fig. 2

Left: Topic volume for each of the 20 topics identified by our topic model, over the entire lifetime of the EC’s Twitter account. Colors represent the proportion of tweets, within a given year, that score highest on the target topic. Right: proportion of tweets per topic between 2020 and 2022. Average across all years is also displayed for reference

Estimates of the volume of tweets over time (Fig. 3) lend further support to these patterns. Focus on economy and governance-related topics, as well as on institutional events and infographics, has decreased steadily over time. Tweets on social policy, environmental and digital policy, and identity and citizen participation have increased steadily over time. Tweets on solidarity and humanitarian aid have steeply increased over the past year. Note that these trends are fairly consistent across individual topics within each topic category.

Fig. 3

Topic volume over time for each topic. Darker lines represent smoothed data (rolling average over 21 days). Lighter lines represent raw data per day

Comparison with other agencies

The evolution in the amount of space given to different areas of policy on Twitter has radically altered the content profile of EC communication. How do these changes affect the distance between the EC and other agencies, in a space that includes technocratic agencies like the ECB, other European institutions, and elected national governments? To answer these questions, we quantified the similarity between topic volumes in EC Twitter communication and topic volumes for other agencies over time. We did so by computing month-by-month pairwise correlations between the proportion of tweets produced, for each topic, by the EC, and the corresponding values for other reference agencies (Fig. 4). Higher correlations denote a closer match between the two agencies in the amount of focus given to each of the 20 topics.

The resulting values are displayed in Fig. 4. The data shows two notable trends. First, the focus of EC communication has become increasingly dissimilar to that of more technocratic institutions such as the ECB and the IMF. Secondly, although less dramatically, EC communication has also become more distinct from that of national governments in our sample. More complex temporal fluctuations, with an overall decrease in similarity, can be observed for the remaining agencies (see Appendix for a more detailed visualization). This suggests that, as a result of the changes in relative focus on different areas of policy described in the previous section (be it the result of a shift in policy priorities or of a shift in communication strategies), the content profile of EC communication has become highly distinct from technocratic institutions, and generally more distinct from other agencies, including national governments.

Fig. 4

Similarity between topic distributions in EC’s Twitter communication and other agencies. Similarities (operationalized as rank correlations between distributions of topic volumes) are calculated on aggregate monthly topic volumes

Alignment with audience engagement

Changes in focus could arguably reflect—at least in part—changes in policy priorities and in the ways the European Commission presents its identity. Both changes in policy priorities and in communication strategy may be a reaction to increasing politicization, and an attempt to elicit more engagement from citizens and increase self-legitimation. If this is the case, and if the strategy is at least partly successful, over time, topics of EC communication should align better with preferences of the EC’s online audience.

To explore this question, we correlate the relative ranking of each topic in the volume distribution (i.e., how often each topic is tweeted about at a given time, relative to other topics) with its ranking in the engagement distribution at previous time points (i.e., how many engagements a given topic elicits on average, relative to other topics). We do so for every month between June 2010 and July 2022, to extract an estimate of how much the content produced by the EC on Twitter aligns with previous patterns of user engagement. Higher correlation values correspond to a high degree of alignment between EC contents and audience topic engagement. If correlation values increase over time, this can be interpreted as evidence of EC communication becoming increasingly more aligned with its social media audience. To test this hypothesis, we fit simple regression models with time as a regressor and correlation-based alignment estimates as outcome variable for each agency in our sample.
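With a single regressor, the regression of alignment estimates on time reduces to an ordinary least-squares slope. A minimal sketch with illustrative names:

```python
def alignment_trend(months, alignments):
    """Fit alignment ~ time by ordinary least squares and return the
    slope, i.e. the per-month change in topic-engagement alignment.

    `months` are time indices (e.g. months since June 2010) and
    `alignments` the monthly rank correlations between topic volumes
    and lagged per-topic engagement.
    """
    n = len(months)
    mx = sum(months) / n
    my = sum(alignments) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(months, alignments))
    var = sum((x - mx) ** 2 for x in months)
    return cov / var
```

A positive slope indicates increasing alignment over time, mirroring the positive coefficient reported for the EC below.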

The results show that EC communication has indeed progressively evolved towards topics that better resonate with its social media audience (\(\beta = 0.0039\), \(p < 0.001\)). Notably, the EC has the highest regression coefficient among all institutions in our sample, suggesting that alignment between topic volumes and engagement patterns has increased at a faster rate in the EC’s communication, compared to other institutions. Full reports and visualizations of these results are provided in the Appendix.

Note that engagement counts reflect preferences of the EC’s own public, which is not fully representative of the general public. Self-selection plays a role both in who follows an account and in who is present on a social media platform in the first place. The interpretation of results involving engagement counts should account for these processes. For example, in this context, it is important to highlight that alignment between EC topics and engagement reflects the EC’s increased ability to generate content that resonates with its own audience, but it does not necessarily reflect an increased ability to generate engagement in the general public.

Linguistic style of the EC’s communication

Linguistic style and engaging messaging

In the previous sections, we analyzed the content of EC communication, both from a diachronic perspective and in relation to other agencies. We showed that EC communication has drifted away from its initial focus on economy-, finance-, and governance-related topics, shifting towards areas of policy (social policy, health, digitalization, solidarity, identity) that generate more public engagement and make its profile markedly distinct from that of technocratic institutions. We also noted that 2017 seems to mark a turning point, which we speculate could be a reaction to the Brexit referendum.

These results are compatible with the idea that the EC’s Twitter communication may have evolved as a response to increasing politicization, using online communication as a means of increased self-legitimation, a process by which “authority holders engage in nurturing the belief in their claim to rule among relevant audiences” (p.134) [5]. To successfully increase self-legitimation, communication should be sufficiently engaging and accessible to the general public in terms of linguistic style as well [5]. However, public communication from the EC on traditional media has been argued to be much less comprehensible than that of comparable institutions [12]. While we do not claim to make any inferences about the relationship between politicization and the described changes in the EC’s Twitter communication in this paper, future research should test these relationships.

To address our third research question on the linguistic style of the European Commission’s Twitter communication, we focus on describing how it has evolved over time and how it compares to that of other reference agencies. In doing so, we analyze two main classes of features: sentiment, a feature which has been shown to be related to engagement [21,22,23] and a broad feature set including several markers of readability, complexity, and use of platform-specific communication tools like emojis, hashtags and mentions. By tracking how these features have changed over time, and benchmarking the EC against other agencies, we seek to understand whether EC communication has evolved towards more engaging and comprehensible messaging, and how its present style compares to that of other agencies.

Features

To comprehensively describe readability, complexity, and platform-specific stylistic features, we use a number of fine-grained descriptors. These include proxies for word-level or lexical complexity (the average frequency of words in corpora of standard English, see [24], as well as average word length); proxies for sentence-level complexity (the number of words or characters in a sentence); features quantifying the extent to which text is action-oriented (e.g., the verb-to-noun ratio), with the rationale that more verbs and fewer nouns yield simpler and more accessible sentences [12]; compound readability indices [25,26,27,28,29,30,31]; the frequency of hashtags, mentions, and emojis, quantified by tracking the presence of distinctive non-alphabetical markers (#, @, unicode emoji codes); and average lexical concreteness [32]. To extract these metrics, we use the Python packages TextDescriptives [33] and pliers [34]. Sentiment is quantified using a pretrained RoBERTa model [35] fine-tuned on sentiment classification for Twitter text ([36]; available at https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment).
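The marker-based counting of hashtags, mentions, and emojis can be approximated with regular expressions. This sketch is illustrative only: the emoji pattern covers a rough subset of Unicode’s emoji blocks, whereas the actual pipeline relies on pliers [34]:

```python
import re

HASHTAG = re.compile(r"#\w+")
MENTION = re.compile(r"@\w+")
# Rough emoji detection via two common Unicode emoji ranges; dedicated
# libraries enumerate the full emoji set.
EMOJI = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")

def marker_ratios(tweet: str) -> dict:
    """Ratio of hashtags, mentions, and emojis over the word count."""
    n_words = max(len(tweet.split()), 1)
    return {
        "hashtags": len(HASHTAG.findall(tweet)) / n_words,
        "mentions": len(MENTION.findall(tweet)) / n_words,
        "emojis": len(EMOJI.findall(tweet)) / n_words,
    }
```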

Sentiment

Fig. 5

Proportion of tweets with positive, neutral and negative sentiment over time. The dark line represents data for the EC. Lighter lines represent data for all other reference agencies. All time series are smoothed using rolling averages over 21 days

Figure 5 displays how the sentiment of EC tweets evolved over time, also compared to other agencies. The proportion of tweets with positive sentiment increased significantly from 2017 on, becoming markedly higher than the proportion of positive tweets produced by all other agencies. This is accompanied by a corresponding decrease in tweets with neutral sentiment, with the EC producing markedly fewer neutral tweets in recent years than all other agencies. The proportion of negative tweets remains roughly constant and comparable to other agencies, with the exception of a slight increase following Russia’s invasion of Ukraine.

Fig. 6

Regression coefficients for pairwise comparisons between EC tweets and other agencies over time. Negative values (red) indicate cases where the EC tends to score higher on a given feature than the agency to which it is compared (x-axis). Positive values (blue) indicate cases where the EC tends to score lower on a given feature than the agency to which it is compared. Values are set to zero if the coefficient is not significant (\(p >0.05\))

We tested these patterns statistically, by fitting a robust regression model (using Huber loss to account for the presence of outliers, [37]) with a categorical variable coding for the agency authoring the tweet as predictor, and normalized sentiment scores as outcome variables. We fitted one model for each year, to highlight how differences between agencies have changed over time. The sign of coefficients provides information on whether a given agency tends to score higher or lower on the target metric, and their values provide information on the magnitude of the effect. The results are displayed in Fig. 6. Outcome variables were standardized before model fitting, to enhance comparability across features. Results for 2022 and immediately preceding years clearly show that EC communication is on average more positive, less neutral, and slightly less negative than that of all other reference agencies.
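Robust regression with a Huber loss can be sketched via iteratively reweighted least squares. This is a generic illustration with an assumed MAD-based scale estimate, not the authors’ implementation (which would in practice rely on an established library such as statsmodels or scikit-learn):

```python
import numpy as np

def huber_irls(X, y, delta=1.35, n_iter=50):
    """Robust linear regression via iteratively reweighted least squares.

    Residuals beyond `delta` robust-scale units are down-weighted,
    mimicking a Huber loss. `X` holds predictor columns (e.g. agency
    dummies), `y` the standardized outcome (e.g. a sentiment score).
    Returns [intercept, slope(s)].
    """
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - X @ beta
        scale = np.median(np.abs(r)) / 0.6745 + 1e-12   # MAD-based scale
        w = np.minimum(1.0, delta * scale / np.maximum(np.abs(r), 1e-12))
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta
```

With a dummy-coded agency predictor, the fitted slope estimates the group difference while discounting outlying tweets.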

Readability and complexity

Figure 7 displays how the linguistic style of EC communication has evolved over time, focusing on a range of features that quantify text complexity at different levels. Note that complexity, here, refers to factors influencing the amount of effort and knowledge needed to process and understand a given text, and the dimensions chosen as proxies of complexity build on methodologies from previous studies [11, 12]. We extracted:

  • an aggregate index quantifying overall readability (“reading complexity”, which synthesizes multiple readability indices using principal component analysis; see section “Further methodological details” for more information);

  • tweet length (synthesizing character-, syllable-, and word-level measures of overall length through principal component analysis);

  • sentence length (a summary variable aggregating sentence length in characters and in words, extracted using principal component analysis);

  • action-orientedness, quantified as verb-to-noun ratio [12];

  • indices of word-level complexity, including both the average length of words in a tweet (“word length”, aggregating syllable and character counts per word using principal component analysis) and the average frequency score of words in the tweet (quantified as Log10 frequency in the SubtlexUS corpus [24]; note that more frequent words involve less complexity, as they are more familiar and refer to more common and simpler concepts);

  • average lexical concreteness [32];

  • the ratio of emojis, hashtags, and mentions over words in a tweet.
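The repeated use of principal component analysis to collapse correlated measures into a single summary score can be illustrated with a minimal numpy sketch (first principal component only; an illustration of the general technique, not the paper’s exact pipeline, and it assumes non-constant columns):

```python
import numpy as np

def first_principal_component(features):
    """Collapse several correlated measures (columns) into one summary
    score per row: standardize columns, then project onto the first
    principal component, computed via SVD."""
    X = np.asarray(features, dtype=float)
    X = (X - X.mean(axis=0)) / X.std(axis=0)       # standardize columns
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[0]                               # per-row scores on PC1
```

The sign of the component is arbitrary; in practice it would be fixed so that higher scores mean, e.g., longer tweets.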

We observe the following patterns. Overall reading complexity has decreased slightly over time. A notable decrease can be observed for word length, accompanied by a slight increase in average word frequency. Verb-to-noun ratio has also increased slightly. These patterns suggest an overall decrease in lexical complexity and increased action-orientedness. On the other hand, tweet-level and sentence-level complexity have increased over time (arguably a reflection of Twitter’s extension of character limits in 2018), and concreteness has decreased. The use of hashtags and mentions has also decreased, while use of emojis has become more prominent.

These tendencies show that increased tweet- and sentence-level complexity is accompanied by decreased lexical complexity and increasing use of multimodal communication tools like emojis, which may result in simpler and more effective communication. But how does the linguistic style resulting from these changes compare to that of other agencies? Is the EC's style overall simpler and more accessible?

To answer these questions, we performed pairwise statistical comparisons between the EC and all other agencies for all the features analyzed in this section. As in the analysis of sentiment scores, we fit separate models for each year and plot estimates over time to understand how stylistic differences have evolved. We fitted robust linear regressions with Huber loss to account for the skewed distributions of these features, with a categorical predictor coding which agency produced a given tweet and the value of the target feature as the outcome variable. As for the sentiment models, the sign of a significant regression coefficient indicates whether a given agency tends to score higher or lower than the EC on the target metric, and its value indicates the magnitude of the effect. A positive coefficient for the ECB on sentence length metrics, for example, would suggest that the ECB produces, on average, longer sentences than the EC. Outcome variables were standardized before model fitting to guarantee comparability across features.

Fig. 7
figure 7

Style features over time. Dark lines represent data for the EC. Lighter lines represent data for all other reference agencies. All time series are smoothed (rolling average over 21 days). Scores are normalized to zero mean and unit variance, scaling across all reference institutions for comparability. Frequency and concreteness indicate average word-level frequency, extracted from SubtlexUS [24], and average word-level concreteness for each tweet, extracted from [32]

The results are displayed in Fig. 8. Interestingly, overall reading complexity has decreased more for the EC than for all agencies except one. While EC tweets have evolved to be on average longer, they have also become less complex at the sentence level compared to those from all other agencies. As a result, the EC now produces tweets with an average reading complexity lower than or comparable to that of other agencies. The picture is more heterogeneous for verb-to-noun ratio and word-level complexity. The EC's verb-to-noun ratio, and thus its action-orientedness, has evolved to become higher than that of technocratic institutions like the ECB and the IMF and of other international agencies, but lower than that of national governments and the European Parliament. On the other hand, EC communication is less lexically complex than that of technocratic agencies, but more lexically complex and less concrete than that of most other institutions, especially national governments. Finally, estimates for Twitter-specific features show that the EC uses significantly more emojis than almost all other agencies, and more hashtags.

Fig. 8
figure 8

Robust regression coefficients showing group differences between each reference agency and the European Commission for all platform-specific communication features. The analysis focuses on data from 2014 to 2022, as for some agencies tweets are only available from 2014 onwards. Positive coefficients indicate that the target agency scores on average higher than the EC on that descriptor. Parameters are set to zero if the p-value is above 0.05

Overall, these patterns show a marked improvement in the accessibility of EC communication over time. EC tweets started off as relatively complex compared to those of all other agencies, and evolved towards a style that is simpler than that of all other agencies on many metrics. Yet, on lexical accessibility metrics and concreteness, EC communication is still consistently more demanding than that of, for example, national governments, a pattern observed in previous work focusing on other media [12], and replicating the finding that EU supranational actors in general have a less accessible style than national governments [5].

Predictors of engagement across content and style

In the previous analyses, we showed that EC communication has evolved to focus more on topics that engage the general public, and to become stylistically more accessible, more engaging, and more multimodal, potentially a strategy to increase self-legitimation and support. However, we have not yet provided direct evidence of how each of the variables analyzed relates to engagement. In the following, we address our fourth and final research question and test whether the topics on which EC communication has increasingly focused, and the style features (traditionally related to accessibility and engagement) analyzed in the previous section, do indeed elicit more engagement (retweets, likes, quote-tweets, and replies). The purpose of this analysis is two-fold: first, to corroborate the hypothesis that the observed changes in topics and style may be related to engagement; second, to provide additional, data-driven insights on drivers of engagement specific to the EC's Twitter public.

To do so, we trained an XGBoost model [38] predicting engagement values based on the topic and style descriptors extracted for the previous analyses. Engagement for any given tweet is computed as the sum of all engagement counts (likes, retweets, quote-tweets, replies) normalized (i.e., divided) by the number of users following the EC's Twitter account on the day the tweet was posted. We normalized by the number of followers because normalizing by the number of impressions was not possible: Twitter's Academic API did not provide access to this information. Follower counts for each day were retrieved using open-source functions from the following repository: https://github.com/ChRauh/PastTwitter, which crawl the Wayback Machine (https://archive.org/web/) to retrieve follower counts at different time points. The earliest available data were from February 16th, 2013, and the latest available data within our time frame of interest were from June 15th, 2022. We used linear interpolation to infer follower counts for all days between the earliest and latest available dates for which follower counts could not be retrieved. Tweets outside this temporal range were not used for the predictive modeling analysis.
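The interpolation and normalization steps can be sketched with pandas as follows (the dates, follower counts, and engagement figure are made-up illustrative values, not data from the study):

```python
import pandas as pd

# Hypothetical follower counts retrieved from Wayback Machine snapshots
# at irregular dates; the days in between are missing.
snapshots = pd.Series(
    [500_000, 530_000],
    index=pd.to_datetime(["2013-02-16", "2013-02-20"]),
)

# Reindex to a daily grid and linearly interpolate the missing days.
daily = snapshots.reindex(
    pd.date_range(snapshots.index.min(), snapshots.index.max(), freq="D")
).interpolate(method="linear")

# Normalize a tweet's total engagement by the follower count on its day.
engagement = 1200  # likes + retweets + quote-tweets + replies
norm_engagement = engagement / daily.loc["2013-02-18"]
```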

While associations between features and engagement could be tested using statistical significance testing, we chose a predictive setup using tree-based methods for a number of reasons. First, predictive out-of-sample testing yields higher generalizability [39]; second, tree-based methods are robust to highly heterogeneous predictors; third, these methods are agnostic both to whether relations between features and outcomes are linear and to the distribution of input features. We performed random grid search to tune hyperparameters, and converged on a model that yields \(R^2 = 0.09\) and a rank correlation between true and predicted engagement values of 0.39. Figure 9 displays feature importance values and SHAP values for the resulting model [40]. SHAP values quantify the contribution of each predictor to individual model predictions, making it possible to understand whether, and in which direction, a given predictor influences model outcomes.

We observe that content related to identity and citizen engagement, citizen initiatives, human rights, and health is positively associated with engagement. These are all topics on which EC Twitter communication has placed increasingly more focus. Topics having a positive impact on engagement also include, perhaps surprisingly, internal governance and finance and trade, topics on which the EC has focused increasingly less over time. On the other hand, economy-related tweets (a closely related topic from which the EC has progressively drifted away) seem to be negatively associated with engagement. Interestingly, topics related to digital and environmental policy are not positively related to engagement in our model.

Fig. 9
figure 9

SHAP values for all predictors included in the XGBoost model predicting engagement data. Plots in the top row display mean absolute SHAP values for each predictor, a proxy for feature importance. Plots in the bottom row display raw SHAP values for all examples in the test set. In the bottom row, high feature values falling on the positive half of the x-axis indicate that higher feature values are associated with higher values of the outcome variable (a positive association). Conversely, high feature values falling on the negative half of the x-axis indicate that higher feature values are associated with lower values of the outcome variable (a negative association). For some variables, relationships are more idiosyncratic

Concerning style, we observe that tweets with lower lexical complexity and lower overall reading complexity, as well as non-neutral tweets, elicit higher engagement, suggesting that changes in EC style may lead to more engagement. Interestingly, a higher verb-to-noun ratio and higher concreteness yield less engagement, while longer tweets elicit more engagement—contrary to what we expected from previous literature on other media.

These changes in topics and style of EC communication over time could be due to strategic efforts to increase public engagement, but future research should test this link explicitly. Furthermore, it is important to highlight that increasing alignment with drivers of engagement might reflect efforts to produce content that aligns with preferences of the EC’s own audience, while not necessarily aiming or succeeding at increasing engagement with the general public.

Discussion

We leveraged advanced natural language processing methods and publicly available Twitter data to describe how the content and style of the EC's Twitter communication evolved between 2010 and 2022. As well as providing a quantitative overview of how the focus of EC communication on different areas of policy has changed and how its style has evolved, we analyzed whether the EC has aligned with engagement patterns, devoting more attention to topics that generated higher engagement in the past and enhancing aspects of its messaging style that tend to yield higher engagement.

In analyzing dynamic changes in the content of EC Twitter communication, we observed a progressive detachment from topics that overlap with the focus of “technocratic” institutions included in our sample (e.g., economy, finance, and internal governance) and increased attention to topics related to social policy, environmental and digital policy, as well as identity and citizen participation. As a result of these changes, the EC has increasingly represented itself in ways that make it highly distinct from technocratic institutions, and generally more unique among comparable agencies. Furthermore, several of the topics on which the EC has focused more in recent times are positively associated with audience engagement. Additionally, the EC’s communication has increasingly aligned with the preferences of its online audience, with alignment between relative volume of each topic and relative number of engagements per topic at previous time points increasing steadily over time and at a faster rate than other institutions.

The style of EC communication has also evolved significantly over the years. We observed a notable shift in the sentiment of EC tweets. While tweets with neutral sentiment were prevalent in earlier years, the proportion of tweets with neutral sentiment has notably decreased over time, and the proportion of tweets with positive sentiment has increased steeply. Compared to other agencies, the EC currently produces significantly fewer tweets with either neutral or negative sentiment, and significantly more tweets with positive sentiment. Non-neutral sentiment has been previously identified as an important driver of online engagement and a factor influencing the potential of online content to spread [21,22,23]. These findings are in line with the results of our predictive modeling analysis, where we observe a negative association between neutral sentiment and overall number of engagements. Based on this evidence, this radical shift in the sentiment distribution of EC tweets—which may reflect an intentional change in communication strategy—can be considered an important step towards developing a messaging style that favors engagement and increases public support. Note, however, that our engagement metrics capture engagement within the EC’s own public, which might not be representative of the general public. Efforts to increase engagement by modulating content and style of tweets may thus not necessarily be targeted to and/or successful at increasing general engagement and support, but rather limited to increasing engagement within this self-selected readership or followers with similar characteristics.

The EC’s messaging style has also evolved in terms of overall readability and complexity. We observed that, while the overall length of tweets has increased—with the EC currently producing significantly longer tweets compared to all other institutions in our sample—EC messaging has become lexically simpler, more action-oriented, more readable, and more multimodal over time. These stylistic features are commonly associated with more accessible messaging, a prerequisite for successful communication [5, 11, 12]. As a result, the EC currently scores better than all other reference institutions on compound readability indices, and better than technocratic institutions on several fine-grained indicators of stylistic complexity and accessibility. However, we observe that the EC’s messaging style is still more lexically complex, less action-oriented and less concrete than other institutions in our sample, especially national governments. These findings are partly in line with recent research observing that the EC’s messaging style on traditional media seems to be less accessible than that of a range of comparable institutions [12].

In line with hypotheses from previous studies, our predictive modeling analyses provide evidence of negative associations between lexical complexity and engagement, and between overall reading complexity and engagement. Interestingly, however, we also observe that less action-oriented, less concrete, and longer tweets tend to yield higher engagement. These findings are in contrast with the assumptions made in previous studies (e.g., [12]). Preferences for longer, less concrete and less action-oriented text might be specific to the EC’s Twitter audience, or specific to Twitter communication. While our results do not support qualified inference on the drivers of these audience-specific effects, they raise novel questions on the relations between text descriptors traditionally associated with accessibility and empirical engagement patterns across multiple public communication channels and sociodemographic subgroups to be addressed in future research.

Finally, we observed that 2017 seems to mark a key moment of transition for EC communication both in relation to the evolution of its topics and in relation to the evolution of its messaging style. We speculate that this may be a reflection of the politicization shock represented by the Brexit referendum, a hypothesis that warrants further investigation in future research.

Future directions

Our study provides a comprehensive overview of how the topics and style of the EC's communication on Twitter evolved between 2010 (the year in which the EC's official Twitter account was created) and the present. Previous literature has underlined how social media can be an important strategic tool to increase perceived legitimacy and citizen engagement in contexts of high politicization [1,2,3], especially for supranational bodies with indirect electoral accountability like the European Commission [5]. Overall, our analyses show that the EC has evolved a communication strategy that better aligns with its audience. This suggests that the EC may have become increasingly successful in exploiting the potential of social media to promote legitimacy and public support. However, empirical validation of the impact of EC social media communication on perceived legitimacy and public support is needed to corroborate this hypothesis.

To this end, future directions include more hypothesis-driven and experimental studies combining Twitter data with public opinion data sources, to directly investigate relations between changes in online communication and empirical metrics of perceived legitimacy and support. In addition, cross-referencing Twitter data with external policy documents and data from other communication channels (e.g., press releases) may help elucidate whether the changes observed in the EC's online identity merely reflect evolving policy priorities, or whether they result from efforts to develop more efficient communication and digital diplomacy strategies.

Finally, there are caveats concerning our findings on predictive relations between linguistic features and engagement data. More adequate indicators of tweet engagement could be constructed by normalizing the raw number of engagements (e.g., likes or retweets) by the number of impressions, that is, the number of times the tweet appeared on users' timelines. Unfortunately, impression counts are not available through the Twitter API, so engagement data were normalized by follower counts, which is not an optimal proxy for engagement rates.

Conclusion

Even on the relatively short time scale offered by Twitter data, we were able to describe important dynamic trends that point to the evolving nature of EC online communication and its emerging digital identity. Our work represents a foundational step towards future hypothesis-driven research, and a demonstration of how state-of-the-art NLP tools can be used to analyze the complex dynamics characterizing the nature of supranational institutions’ digital communication.

Further methodological details

Text preprocessing

The preprocessing pipeline for text involved the following steps: (1) removing retweets and mentions, identified respectively as tweets beginning with the string “RT” (as per Twitter API convention) and tweets beginning with the symbol “@”; (2) stripping links (identified by the presence of the substring “http”) from the tweet's text, while tagging each tweet by whether it originally contained a link, as one of the features of interest for the stylistic analysis; (3) normalizing HTML character encodings (e.g., converting “&amp;” to “&”); (4) removing all tweets that were not tagged as English by the Twitter API (as indicated by the “lang” field in the API response), and all tweets that included no residual characters after preprocessing (e.g., link-only tweets). As some of our pipelines involve transformer models—which model both lexical and non-lexical components of linguistic input, have a rich token vocabulary, and are robust to misspellings and spelling variability—tweets did not undergo any further preprocessing. Some of the accounts included in the analysis occasionally tweet in languages other than English. This is also the case for the European Commission's Twitter account, where a small percentage of tweets are in other languages; for all non-English tweets, however, a corresponding tweet in English is available. As mentioned, the present study focuses only on English tweets.
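A minimal sketch of these steps in Python (our own illustrative implementation; the helper name and exact regular expression are assumptions, not the authors' code):

```python
import html
import re

def preprocess_tweet(text, lang):
    """Apply the filtering and normalization steps described above.

    Returns a dict with the cleaned text and a link flag, or None if
    the tweet should be discarded.
    """
    # (1) Drop retweets and tweets beginning with a mention.
    if text.startswith("RT") or text.startswith("@"):
        return None
    # (2) Tag whether the tweet contained a link, then strip links.
    has_link = "http" in text
    text = re.sub(r"http\S+", "", text).strip()
    # (3) Normalize HTML entities (e.g., "&amp;" -> "&").
    text = html.unescape(text)
    # (4) Keep only English tweets with residual content.
    if lang != "en" or not text:
        return None
    return {"text": text, "has_link": has_link}
```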

Engagement metrics

The following fields were natively available from the Twitter API: number of retweets (“retweet_count”), likes (“like_count”), quotes (“quote_count”), and replies (“reply_count”). To extract a simpler summary measure of engagement, we computed the sum of these four engagement metrics (henceforth: “engagement”). Note that the Twitter API does not provide access to the number of times a given tweet was viewed, a metric required to compute the engagement rate, a more stable measure of engagement. Since view counts cannot be accessed, we normalized engagement counts by the number of followers of the EC account on the day the tweet was produced.

Data splits

70% of the tweets from the European Commission's official account were assigned to the training set for topic models, while the remaining tweets were evenly divided into a 15% validation set and a 15% test set, used for early stopping and model selection. Compared to traditional approaches to topic modeling, where models are trained and evaluated on the same data, this procedure ensures that estimates of model performance are more robust and generalizable, and provides a better estimate of the model's potential to accurately describe new tweets not included in the current sample. The same splits were used to estimate the XGBoost models.
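The splitting scheme can be sketched as follows (a minimal sketch; the random seed and use of numpy are our own assumptions):

```python
import numpy as np

def split_indices(n, seed=42):
    """Shuffle indices and split them 70/15/15 into train/val/test."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train = int(n * 0.70)
    n_val = int(n * 0.15)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test

train, val, test = split_indices(1000)
```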

Contextualized topic models

Contextualized topic models have been shown to improve intra-topic coherence and inter-topic distinctness compared to both bag-of-words and more traditional neural topic modeling methods, by supplementing bag-of-words representations with representations from transformer models [20]. To estimate the best parameters for our topic models, we fitted a family of models varying in parameter configurations. We parametrically varied the number of target topics (between 10 and 100), the learning rate (2e-2 to 2e-5), the batch size (16 or 64 tweets), and the vocabulary size of the bag-of-words model (250, 500, or 1000 words). We also varied the pretrained transformer model used to extract contextualized embeddings, experimenting with a standard pretrained DistilBERT model [9], a version of DistilBERT fine-tuned on masked language modeling using our own tweet corpus, a range of sentence transformers [41], and a model fine-tuned on generic topic extraction from Twitter content available on the HuggingFace Model Hub (https://huggingface.co/cardiffnlp/tweet-topic-21-multi). To select models that not only describe the dataset used for the present study, but also have the highest chance of generalizing to future tweets, we extracted a training and a validation set from the EC's tweets (covering 70% and 15% of the tweets, respectively), trained topic models on the training set, and used the validation set for early stopping. The best model was selected on the basis of topic coherence on a test set that included both EC test-set tweets and test sets from all other agencies. Coherence was operationalized as normalized pointwise mutual information (NPMI) between topic-defining words [42], a commonly used metric in the field.
By estimating the model on the European Commission's tweets and evaluating it on test sets from all agencies, we make sure that the selected model (a) is specifically tailored to identifying semantic attractors in the EC's Twitter communication, and (b) identifies a set of topics that are also represented in tweets from other agencies, allowing for comparisons. As neural topic models can display unstable estimates [43], we fit each model for 5 distinct runs. The best-performing model was a pre-trained DistilBERT model with 20 target topics and a 500-word vocabulary (10-word NPMI = 0.064 on the unseen test set). The resulting model was used to extract per-tweet scores for each topic. These are used for analyses and visualizations focusing on the temporal evolution of topic volumes, as well as for predictive modeling analyses investigating the relation between linguistic features and engagement.
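For reference, NPMI for a word pair is the pointwise mutual information normalized by the negative log of the joint probability, averaged over all pairs of topic-defining words. A simplified, self-contained sketch using document-level co-occurrence counts (actual coherence implementations may use sliding windows and different smoothing):

```python
import math
from itertools import combinations

def npmi_coherence(topic_words, documents, eps=1e-12):
    """Average normalized PMI over all pairs of topic-defining words,
    with probabilities estimated from document co-occurrences."""
    n_docs = len(documents)
    doc_sets = [set(doc) for doc in documents]
    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1 = sum(w1 in d for d in doc_sets) / n_docs
        p2 = sum(w2 in d for d in doc_sets) / n_docs
        p12 = sum(w1 in d and w2 in d for d in doc_sets) / n_docs
        if p12 == 0:
            scores.append(-1.0)  # words never co-occur: minimum NPMI
            continue
        pmi = math.log(p12 / (p1 * p2 + eps))
        scores.append(pmi / (-math.log(p12 + eps)))
    return sum(scores) / len(scores)
```

Words that always co-occur score close to +1; words that never co-occur score -1, so higher values indicate a more coherent topic.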

Aggregating style features using principal component analysis

We used the Python packages TextDescriptives [33] and pliers [34] to extract all style metrics analyzed in the paper. For features (e.g., readability, word length) for which multiple indicators are available in TextDescriptives, we performed principal component analysis to reduce overlapping feature sets to a single feature.

An aggregate readability index was extracted by performing a principal component analysis on the 7 readability indices available through the TextDescriptives package used for feature extraction, which included: the Flesch Reading Ease index, the Flesch-Kincaid Grade, the Gunning Fog index, the Automated Readability index, the Coleman-Liau index, LIX, and RIX. An in-depth description of each metric is provided in the package documentation, available at: https://hlasse.github.io/TextDescriptives/readability.html. PCA yielded a first component explaining over 80% of the variance; this component was used as the aggregate readability index for all analyses. Loadings were negative for all readability indices except the Flesch Reading Ease, reflecting the fact that this index quantifies ease of reading, while all other indices quantify reading complexity. The resulting index is therefore to be interpreted as an indicator of reading complexity.
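The aggregation step can be sketched with numpy (our own illustration via SVD; the paper's implementation may instead use a library PCA routine such as scikit-learn's):

```python
import numpy as np

def first_component_scores(X):
    """Project a feature matrix onto its first principal component.

    Returns per-row scores, the component loadings, and the share of
    variance explained by the first component.
    """
    Xc = X - X.mean(axis=0)            # center each index
    # SVD of the centered matrix: rows of Vt are the principal axes.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[0]                # scores on the first component
    explained = S[0] ** 2 / (S ** 2).sum()
    return scores, Vt[0], explained

# Toy example: two perfectly correlated "readability indices" collapse
# onto a single component that captures all the variance.
x = np.arange(10.0)
X = np.column_stack([x, 2 * x])
scores, loadings, explained = first_component_scores(X)
```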

The sentence length variable used in the style analyses also results from a PCA conducted on the two variables available through TextDescriptives which quantify sentence length in terms of number of words and number of characters. For simplicity, we reduced them to the first principal component of a PCA estimated on both variables, which captured around 97% of the variance. Loadings for both variables were positive, meaning that the final aggregate variable displays higher values for longer sentences.

The same procedure was used to aggregate the two measures of word length (in characters and in syllables) available through the TextDescriptives package (with the first principal component capturing 92% of the variance and both loadings being positive) and to aggregate the three measures of overall text length (number of words, number of characters, and number of sentences), where the first principal component captured around 78% of the variance and all loadings were positive.

XGBoost model

To analyze the relation between topic and stylistic features and engagement, we fitted an XGBoost model, using the sum of all engagement types (likes, retweets, quotes, and replies) as the outcome variable, and all topic and style predictors mentioned in the respective sections as input features. We split EC tweets into 70/15/15 train/validation/test splits (see “Data splits” subsection), and used the validation set to perform grid-based hyperparameter optimization. We used Tweedie loss [44] to best fit the highly skewed distribution of the engagement data. The best model (used for the analyses presented in the manuscript) had a rank correlation of 0.39 with engagement data on the test set, and \(R^2 = 0.09\) on the test set. We performed a hyperparameter search over the following parameter grid, iterating over 2000 combinations:

  • learning_rate: [2e-5, 2e-3, 2e-2, 2e-1]

  • min_child_weight: [1, 5, 10, 50]

  • gamma: [0, 0.5, 1.0, 2.0]

  • subsample: [0.6, 0.8, 1]

  • colsample_bytree: [0.3, 0.5, 0.7, 1]

  • max_depth: [2, 3, 5, 10, 20]

  • reg_alpha: [0, 0.1, 5.0]

  • reg_lambda: [0, 0.1, 1.0]

  • n_estimators: [1, 5, 10, 30, 50]

  • tweedie_variance_power: [1.01, 1.3, 1.6, 1.8, 1.99]

The best model (selected on the basis of its root mean squared error on the validation set) had the following parameters: learning_rate = 0.2; min_child_weight = 10; gamma = 1.0; subsample = 0.8; colsample_bytree = 1; max_depth = 3; reg_alpha = 0.1; reg_lambda = 0.1; n_estimators = 50; tweedie_variance_power = 1.3.