1 Introduction

Despite the fact that 97% of climate scientists agree on the fundamental principles of anthropogenic climate change (Cook et al. 2016), and despite several decades of intense and coordinated messaging around that fact, up to a third of people in some countries remain sceptical (Egan and Mullin 2017; Hamilton et al. 2015; Hornsey et al. 2022b; Polino 2019). This is a major problem: where the public is not convinced that humans are primarily responsible for climate change, they are unlikely to support efforts to engage in rapid, deep decarbonisation (Hornsey and Lewandowsky 2022). Understanding the predictors of climate scepticism is an important step toward developing tools to overcome its negative impact on societies’ collective ability to respond to the climate change crisis. Currently there is very little literature equipped to identify the macro-level factors that influence climate change scepticism around the world, especially publicly declared scepticism. The current study helps fill this gap by using a recently released Twitter dataset to model variations in levels of climate scepticism (a) across states in the U.S., and (b) across nations.

‘Climate scepticism’ is a term with broad and flexible implication (Hornsey and Lewandowsky 2022). It encompasses the outward rejection of climate trends as well as doubt about the impacts of climate change and the scientific consensus about the effectiveness of mitigation measures. The majority of research on predictors of climate scepticism focuses on individual-level predictors (e.g., people’s values, ideologies, cognitive styles, experiences with extreme weather events) (Hornsey et al. 2016). Although it is important to map individual-level factors associated with climate scepticism, it is widely recognised that the phenomenon cannot be understood exclusively as something that emerges in isolation within the hearts and minds of individuals. Rather, individuals are situated within economic and socio-structural contexts that also shape their orientations toward climate change (Chater and Loewenstein 2023; Czarnek et al. 2021; Hornsey and Lewandowsky 2022; Smith and Mayer 2019).

Historically, research on climate change attitudes drew heavily on single-nation samples, and so little was known about the macro-level factors that shape climate change scepticism globally. In the last decade, however, growth in availability of multi-nation samples has helped fill this gap. For example, some scholars have examined indices of nation-level affluence as a factor that might contextualise people’s views on climate change. Much of this research was designed to test the ‘post-materialism hypothesis’, which proposes that richer regions are more likely to embrace progressive social movements such as environmentalism because their citizens are more likely to have satisfied their material needs for physical and economic security (Inglehart 1977, 1990). However, evidence for this notion is mixed. Dunlap and York (2008) and some international surveys suggest that climate change concern is greatest in nations with relatively low GDP (Kim and Wolinsky-Nahmias 2014; Lee et al. 2015; Mostafa 2016).

Analyses of international surveys also suggest that nation-level affluence may interact in interesting ways with education. In most nations, education is negatively associated with climate scepticism (Czarnek et al. 2021; Hornsey et al. 2016). However, in nations with high economic development, this relationship is moderated by right-wing ideology such that people become more politically polarised in their attitudes about climate change the more educated they are (Czarnek et al. 2021). One interpretation of this pattern is that, in less affluent nations, people’s views on climate science may be less prone to ideological influence and more responsive to immediate threats to security. Analysis of polls in Africa, for example, shows that climate literacy is positively associated with both the perception and the reality of changing precipitation in the decades prior to the survey (Simpson et al. 2021).

Another macro-level variable that has received attention in the climate change literature – and the variable that forms the focus of the current paper – is a region’s per capita carbon emissions. The rationale for examining this variable is that per capita carbon emissions represent a proxy for the extent to which the production of fossil fuels is a significant part of the economy in which the individual is situated. In turn, this can be considered a proxy for vested interests associated with maintaining the economic status quo.

Historically, disinformation about climate change has been disseminated in a co-ordinated manner through well-funded networks (Almiron et al. 2020; Brulle et al. 2021; Dunlap and Jacques 2013; Jacques et al. 2008), and these networks are often supported by elements of the fossil fuel industry (Brulle 2014; Farrell 2016; Oreskes and Conway 2010) and by politicians who saw political opportunity in being seen to be protecting jobs in the fossil fuel industry (Hornsey and Lewandowsky 2022; Plehwe 2022). The vested interest explanation maintains that these campaigns of misinformation will be coordinated and targeted in regions where the economic stakes of acknowledging climate change are particularly dramatic (i.e., those with high per capita carbon emissions). Furthermore, the vested interests account suggests that citizens in regions with higher carbon emissions spontaneously fear the economic costs associated with decarbonising, and so have a psychological motivation to embrace campaigns of misinformation and to reject the climate science.

Some preliminary research found international patterns consistent with a vested interests account. For example, a comparison of 14 Western, industrialised nations showed that the belief that “Many of the claims about environmental issues are exaggerated” was greater among nations with relatively high per capita carbon emissions (Tranter and Booth 2015). Another survey of 25 nations showed that the link between conservative ideology and climate scepticism was greater the higher the per capita carbon emissions of the nation (Hornsey et al. 2018). Finally, analysis of 27 nations from 1996 to 2010 indicated that newspaper coverage of climate change (both sceptical and non-sceptical) was greatest in fossil fuel-producing nations that made commitments under the Kyoto Protocol (Schmidt et al. 2013). Together, these studies suggest that there is a different orientation toward climate change in nations with higher per capita carbon emissions: coverage of climate science is more at the forefront of media discussion (Schmidt et al. 2013), and belief in climate change is both more fragile (Tranter and Booth 2015) and vulnerable to political polarisation (Hornsey et al. 2018).

Although suggestive, it is difficult to draw strong conclusions from these studies because the number of sampled nations is too small. Many nation-level variables co-vary (e.g., culture, education, GDP) so it is difficult to disentangle the effects of one nation-level variable from potential confounds without high levels of statistical power (Hornsey et al. 2023). Geographically vast, high-powered surveys require significant resources to administer, so it is rare for academics to be able to self-fund samples that can comprehensively capture macro-level differences across regions. Some government-funded consortia – for example the 2008 Eurobarometer (Papacostas, 2013), Round 8 of the European Social Survey (Poortinga et al. 2019), and Cambio climático y opinión pública en América Latina (Polino 2019) – have measured climate change scepticism, but have < 30 nations in the data field, typically not enough to isolate the effects of one nation-level variable from other related variables.

The biggest limitation of the larger datasets from international consortia is that they do not measure climate scepticism per se, preferring instead to measure concepts such as concern about climate change, perceived risk, and perceived threat (e.g., World Values Survey, Lloyd’s World Risk Register, International Social Survey Program). Although these constructs tend to correlate with scepticism, they are clearly distinct: people can be concerned that climate change will have a negative impact, independently of the extent to which they perceive that humans are responsible for it.

One solution that allows for a direct measure of publicly expressed climate scepticism and circumvents the costs associated with releasing large multinational surveys is to extract and analyse large quantities of tweets. The accessibility of the Twitter API and advances in tweet processing packages have led to a surge of climate-related Twitter research over the past decade. The majority of this work has focused on measuring public views towards climate change by using various forms of natural language processing to classify the tone of climate-related tweets on a range of pre-specified scales, such as emotional valence (e.g., positive – negative) or sentiment (e.g., happy – sad). Examples include Cody et al. (2015), who found that tweets containing the word ‘climate’ were sadder than climate-unrelated tweets, and Veltri and Atanasova (2017), who found that anger was the emotion most frequently associated with climate-related tweets.

Only a small number of studies have analysed tweets to identify the antecedents of climate scepticism. In a sample of 5.7 million tweets, Jang and Hart (2015) found that Republican-held U.S. states were more likely to frame climate change as a hoax in Twitter discourse. Exploring gender differences, Holmberg and Hellsten (2015) found that males were more likely than females to reference Twitter accounts with a sceptical stance on climate change.

Little work has yet explored the macro-level antecedents of publicly expressed climate scepticism through the use of a large multinational Twitter dataset. One exception to this is a recent analysis of Twitter data in the U.S., which found that climate change scepticism in a U.S. state was negatively associated with education and income levels in that state, and positively associated with that state’s carbon intensity and its propensity to vote Republican (Gounaridis and Newell 2024). This analysis was based exclusively on bivariate correlations, however, so it is difficult to know whether carbon intensity had effects independently of income, education, or political conservatism. Furthermore, the analysis is restricted to the U.S. Presumably, this was in part due to the large amount of processing required to extract, store, and classify data from around the world.

Fortunately, a dataset has recently been released that lays the foundation necessary to accomplish this task: The Climate Change Twitter Dataset (Effrosynidis et al. 2022; Effrosynidis, Sylaios, et al., 2022). Containing over 15 million climate-related tweets from around the world, the dataset uses state-of-the-art machine learning algorithms and methods to categorize the tone and content of the tweets. Most relevant to the current analysis is the classification of each tweet as sceptical (or not) towards the notion of anthropogenic climate change. To our knowledge, this is the largest publicly available climate-related Twitter dataset in existence. Upon making the data open access, the authors expressed a desire for others to subject it to social science inquiry, which is the goal of the current paper.

Although the dataset draws from 126 nations, approximately 56% of the geo-located tweets emerged within the U.S., offering the statistical power to examine macro-level factors that explain variations in scepticism across states within that nation. Once established at the intra-nation level, we sought to cross-validate patterns across the full 126-nation dataset.

A Twitter corpus is quite different from the large-scale climate scepticism surveys mentioned earlier, not least because the former captures publicly expressed beliefs whereas the latter capture privately held beliefs. However, this can also be construed as an advantage of Twitter data. Private beliefs may impact individual behaviours, but publicly expressed attitudes have multiplier effects: through well-established processes of normative influence, people’s individual actions can have effects on their peers, effects that can ultimately trigger tipping points that create collective change (Hornsey et al. 2021). In this sense, examining publicly expressed beliefs is getting closer to the heart of how climate scepticism is performed and how social influence is enacted.

Relatedly, The Climate Change Twitter Dataset is valuable because, in addition to coding for scepticism, it also codes for levels of aggressiveness. This provides insight not just into the content of public debates, but also the tone of them. Examining aggression in public discourse is particularly valuable when examining climate scepticism in countries such as the U.S., where climate science has been drawn into the suite of attitudes and beliefs that define people’s political identities in the eyes of others (Hornsey et al. 2022a, b; Mann 2012). This polarisation in discussion of climate change has frequently been used as an example of the so-called “culture wars” that describe and prescribe one’s ideological or political worldview. As implied by the “culture wars” phrase, the resultant discourse is not necessarily civil or neutral. Where scepticism and aggressiveness are high, it suggests not just high levels of rejection of climate science, but a debate situated in a heated (and possibly polarised) contest.

In sum, the current paper analysed an international Twitter database to examine the association between a region’s per capita carbon emissions and the tone and content of climate change commentary in that region. Hypothesis 1 is that climate scepticism would be higher in regions with higher per capita carbon emissions. Hypothesis 2 is that higher per capita carbon emissions would be associated with greater aggressiveness in tone of climate-related tweets in that region. Our first analysis examined these hypotheses across states in the U.S. We then sought to replicate these findings on the global dataset using nations as the unit of analysis.

2 Method

2.1 Individual-level data

Individual-level data used in our analysis originated from the publicly available Climate Change Twitter Dataset which was released and published by Effrosynidis and colleagues (2022a). This dataset contains over 15 million English climate-related tweets from 2006 to 2019 and includes several variables that characterise each tweet using state-of-the-art machine learning algorithms and methods. Complete details including links to source code can be found in the original manuscript. Further variables contained in The Climate Change Twitter Dataset that are not analysed here include sentiment, deviations from historic temperature, topics discussed, and environmental disaster events (Effrosynidis, Sylaios, et al., 2022).

2.1.1 Gender

Effrosynidis et al. (2022a) determined the gender of each tweeter (male, female, undefined) via the gender classifier ‘chicksexer’ Python package. This package uses an LSTM neural network to estimate the gender of each tweeter via their username. Effrosynidis et al. (2022a) tested the accuracy of ‘chicksexer’ using 31,271 names from the U.S. Social Security Administration dataset, observing that the classified gender matched the recorded gender at birth in 90% of cases.

2.1.2 Scepticism

Effrosynidis et al. (2022a) classified each tweet using Transfer Learning (an application of natural language processing) as either supporting, not supporting, or neutral towards belief in anthropogenic climate change. To ensure the classification process focused primarily on scepticism vs. belief in anthropogenic climate change, Effrosynidis et al. (2022a) filtered The Climate Change Twitter Dataset to ensure the hydrated tweets contained at least one of the following keywords: #climatechange, #climatechangeisreal, #actonclimate, #globalwarming, #climatechangehoax, #climatedeniers, #climatechangeisfalse, #globalwarminghoax, #climatechangenotreal.

Although gaining a technical understanding of tweet classification via machine learning is quite complex, a conceptual understanding is relatively straightforward. First, a machine learning model is trained to classify tweets from a dataset that already had human reviewers manually classify tweets. This human-classified dataset serves as a ground truth against which the model’s predictions can be compared. Effrosynidis et al. (2022a) used the Twitter Climate Change Sentiment Dataset for this purpose. This dataset contained 44,000 climate-related tweets classified by three independent human reviewers as either supportive, not supportive, or neutral with regard to their belief in anthropogenic climate change. Overall, 6,416 tweets were separated from this dataset before training to be used for model accuracy evaluation, which involves comparing the classifications made by the model to human classifications. Classification accuracy in this case is defined as the proportion of classifications that the model makes that align with the human classifications. For example, if the model classified a tweet as sceptical, and the prior human reviewer classified the same tweet as sceptical, the model’s classification would be defined as ‘correct’.
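The accuracy definition above can be illustrated with a minimal sketch. It is not the code used by Effrosynidis et al. (2022a); the function name and the toy labels are hypothetical and serve only to show how the proportion of matching classifications is computed on a held-out set.

```python
def classification_accuracy(model_labels, human_labels):
    """Proportion of held-out tweets where the model label matches the human label."""
    if len(model_labels) != len(human_labels):
        raise ValueError("label lists must be the same length")
    if not human_labels:
        raise ValueError("need at least one labelled tweet")
    # A classification counts as 'correct' when both labels are identical.
    matches = sum(m == h for m, h in zip(model_labels, human_labels))
    return matches / len(human_labels)


# Hypothetical toy labels, not taken from the dataset.
human = ["sceptical", "supportive", "neutral", "supportive", "sceptical"]
model = ["sceptical", "supportive", "supportive", "supportive", "neutral"]
accuracy = classification_accuracy(model, human)  # 3 of 5 labels agree
```

In this toy case three of the five model labels agree with the human labels, so the accuracy is 0.6.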

Effrosynidis et al. (2022a) tested multiple machine learning models and compared their accuracy. The tweet classification model chosen to classify the tweets contained in The Climate Change Twitter Dataset (bidirectional encoder representations from transformers; BERT) was observed to be 78% accurate with the classifications made by human reviewers on the Twitter Climate Change Sentiment Dataset (Effrosynidis et al. 2022).

2.1.3 Aggressiveness

Effrosynidis et al. (2022a) classified each tweet using Transfer Learning as either aggressive or not aggressive. As Effrosynidis et al. (2022a) were unable to locate a human classified Twitter dataset on aggressiveness, they trained and tested their model on a combined dataset from two sources. The first source contained human-classified tweets as non-hateful or hateful, and the second source contained tweets classified by human reviewers as offensive or not offensive. Here, the ground truth for aggressiveness represented a tweet that was classified as either hate speech or offensive. The observed accuracy rate was 97% for this model. Although hate speech and offensiveness are distinguishable from aggressiveness, we agree with Effrosynidis et al. (2022a) that the term ‘aggressiveness’ is an appropriate umbrella label to describe the commonalities between hateful and offensive tweets.
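As a rough illustration of how two differently labelled sources can be merged into one binary target, the sketch below maps each source's positive label (hateful or offensive) onto ‘aggressive’. The source names and label strings are assumptions made for this example; the original datasets may encode their labels differently.

```python
# Hypothetical source names and label strings (assumptions for illustration).
POSITIVE_LABEL = {"hate_source": "hateful", "offence_source": "offensive"}


def is_aggressive(source, label):
    """Map a source-specific human label onto the binary aggressiveness target."""
    return label == POSITIVE_LABEL[source]


combined = [
    ("hate_source", "hateful"),
    ("hate_source", "non-hateful"),
    ("offence_source", "offensive"),
    ("offence_source", "not offensive"),
]
# True where the human reviewers flagged the tweet as hateful or offensive.
targets = [is_aggressive(source, label) for source, label in combined]
```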

2.1.4 Geolocation

Although the location from which a tweet was posted is available for download from the Twitter API, very few users of the app enable this feature when tweeting. To improve the quantity of geolocations available, Effrosynidis et al. (2022a) geolocated 5 million of the 15 million tweets contained in the Climate Change Twitter Dataset using a self-developed Python script that generated location from a combination of tweet geolocation and user location.

The coordinates generated by Effrosynidis et al. (2022a) were of sufficient fidelity for us to use the ‘sf’ package in R to hydrate the name of the country from which the tweet originated. From this, we created two datasets. The first dataset contained only tweets originating from states within the continental U.S. The second dataset contained 3,926,566 tweets from 126 nations. Although 3,928,222 tweets and 174 nations were present in the original Climate Change Twitter Dataset, some countries had very low numbers of tweets. To avoid the potential of an unrepresentative sample from these countries, we excluded countries from analysis if they had fewer than 100 tweets in the dataset.
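The exclusion rule can be sketched as follows. This is a simplified Python illustration rather than the R code used in the analysis; the function name, country codes, and counts are hypothetical, and only the threshold of 100 tweets comes from the text.

```python
from collections import Counter

MIN_TWEETS = 100  # exclusion threshold stated in the text


def filter_countries(tweet_countries, min_tweets=MIN_TWEETS):
    """Drop tweets from countries with fewer than `min_tweets` tweets."""
    counts = Counter(tweet_countries)
    kept = {country for country, n in counts.items() if n >= min_tweets}
    return [c for c in tweet_countries if c in kept], kept


# Hypothetical toy data: one country label per geolocated tweet.
countries = ["US"] * 150 + ["AU"] * 100 + ["XX"] * 99
kept_tweets, kept_countries = filter_countries(countries)
```

Here the hypothetical country "XX" falls below the threshold with 99 tweets and is dropped, while a country with exactly 100 tweets is retained.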

2.2 Nation-level and state-level data

2.2.1 CO2 emissions per capita

There are multiple ways of measuring the per capita carbon emissions of a given state or country. Here, we elected to operationalise per capita CO2 emissions as the combination of emissions from fossil fuels and industry from each nation / state divided by total population. This measure, in essence, is a ‘production-based’ measure of per capita CO2 emissions that does not account for imported fossil fuels. We used this measure instead of other consumption-focused alternatives, as we believe it most accurately captures the degree of state collective vested interest in the fossil fuel industry. While consumption-based measures of CO2 emissions reflect reliance on fossil fuels more indirectly (as, for example, a source of energy or fuel) fossil fuel production is a more direct proxy for economic vested interests.

In our U.S. dataset, we sourced per capita CO2 emissions from www.eia.gov/environment/emissions/state/analysis/. In our multinational dataset, we sourced data published by https://ourworldindata.org/co2-emissions#per-capita-co2-emissions.

In addition to CO2 emissions per capita, we introduced three control variables into the dataset, each with known associations with climate scepticism in prior literature.

2.2.2 GDP per capita

GDP was operationalised in terms of purchasing power parity (PPP). PPP compares economic productivity and standards of living between regions, adjusting for different currencies. In our U.S. dataset, we sourced state-level GDP data published by https://apps.bea.gov/regional/histdata/releases/0616qgsp/index.cfm.

2.2.3 Education

To capture differences in schooling across U.S. states, we operationalised education as the percentage of state population with a high school diploma, sourced from https://worldpopulationreview.com/state-rankings/educational-attainment-by-state.

2.2.4 Political orientation

Political conservatism has been documented as one of the biggest predictors of climate scepticism in the U.S. (Antonio and Brulle 2011; Brulle et al. 2012; Dunlap et al. 2016; Hamilton 2011; Hornsey et al. 2018). To capture differences in political opinion across U.S. states, we averaged the percentage of Democrat votes for each state across the total data collection period (2006–2019).

3 Results

3.1 Climate scepticism within the U.S.

Initial analyses focused on examining patterns of climate scepticism and aggressiveness among the 49 states of the U.S. captured in the Climate Change Twitter Dataset. As described in 2.1.2, the Climate Change Twitter Dataset categorized each tweet into one of three categories: (1) expressing belief in anthropogenic climate change, (2) expressing scepticism about anthropogenic climate change, or (3) being neutral or irrelevant with respect to belief in anthropogenic climate change. Tweets in the last category were deleted from analysis. Coordinates generated by Effrosynidis et al. (2022a) allowed us to identify the state within which each tweet originated (Fig. 1).

Fig. 1 Visualisation of 2,396,611 Climate-Related Tweet Locations Across Continental U.S.

Next, we averaged the scepticism classification of all tweets in the state to create a total measure of state-level climate scepticism (not sceptical = 1, sceptical = 2). Scores closer to 1 indicated low amounts of scepticism, whereas scores closer to 2 represented high amounts of scepticism. Consistent with hypotheses, there was a general tendency for scepticism to be higher in states with higher per capita carbon emissions (ρ = 0.66, p < .001).
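The state-level score can be illustrated with a short sketch. The function name and toy tweets below are hypothetical; the 1/2 coding follows the text. Under this coding, the state score minus 1 equals the proportion of sceptical tweets in that state.

```python
from collections import defaultdict


def state_scepticism_scores(tweets):
    """tweets: iterable of (state, is_sceptical) pairs.

    Returns the mean of the 1 (not sceptical) / 2 (sceptical) codes per state.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for state, is_sceptical in tweets:
        totals[state] += 2 if is_sceptical else 1
        counts[state] += 1
    return {state: totals[state] / counts[state] for state in counts}


# Hypothetical toy data: state "A" has one sceptical tweet out of four.
toy_tweets = [
    ("A", True), ("A", False), ("A", False), ("A", False),
    ("B", False), ("B", False),
]
scores = state_scepticism_scores(toy_tweets)
```

In the toy data, state "A" scores 1.25 (one sceptical tweet in four) and state "B" scores 1.0 (no sceptical tweets).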

Before formally testing the association between per capita CO2 emissions and scepticism, it is important to acknowledge that the Climate Change Twitter Dataset contains non-repeated observations of individuals and repeated observations of states over time. Although our focus is on between-state differences in emissions, it is important to control for within-state variation over time. We did this by specifying a three-level logistic regression model using the “lme4” package in the R statistics program, similar to Duijndam and van Beukering (2021). We entered the tweeter’s gender at Level 1. At Level 2 we included within-state per capita CO2 emissions, state GDP per capita, and education rates, effectively controlling for time. These variables were group mean centred around the state mean to capture within-state variation over time. At Level 3, which is the key analysis to test our hypotheses, we added between-state per capita CO2 emissions, state GDP per capita, and education rates. Because political orientation was measured less frequently (once per four-year election cycle) than the other variables (yearly), we averaged the percentage of Democrat votes for each state across the total data collection period (2006–2019) and added this variable at Level 3. All Level 2 and 3 variables were standardised to assist with comparing coefficients. Random intercepts were fitted for each state and state-year.
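The split into within-state and between-state predictors can be sketched as follows. This is a simplified Python illustration, not the R code used for the models: the function names and toy values are hypothetical, and the use of the population standard deviation for standardising is an assumption.

```python
from collections import defaultdict
from statistics import mean, pstdev


def decompose(rows):
    """rows: list of (state, year, value) tuples.

    Returns one dict per row with the between-state component (the state
    mean) and the within-state component (deviation from the state mean).
    """
    by_state = defaultdict(list)
    for state, _year, value in rows:
        by_state[state].append(value)
    state_means = {state: mean(values) for state, values in by_state.items()}
    return [
        {"state": s, "year": y, "between": state_means[s], "within": v - state_means[s]}
        for s, y, v in rows
    ]


def standardise(values):
    """Z-score a list of values (population SD; an assumption of this sketch)."""
    m, sd = mean(values), pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardise a constant variable")
    return [(v - m) / sd for v in values]


# Hypothetical per capita emissions for two states over two years.
rows = [("A", 2018, 10.0), ("A", 2019, 14.0), ("B", 2018, 20.0), ("B", 2019, 24.0)]
parts = decompose(rows)
within = [p["within"] for p in parts]            # deviations from each state's mean
between_z = standardise([p["between"] for p in parts])
```

Because the within-state component is centred on each state's own mean, it sums to zero within every state and carries only change over time, whereas the between-state component carries only stable differences among states.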

Consistent with Hypothesis 1, the Level 3 effects confirm that scepticism was higher in states with higher per capita CO2 emissions (see Model 1 of Table 1). Of the control variables, scepticism was also lower the more states voted Democrat and the more educated they were. Although the Level 2 effects were not a primary focus for analysis, we note that scepticism was also positively associated with within-state changes in per capita CO2 emissions over time, but negatively associated with within-state changes in GDP and education. At Level 1, males were more likely than females to have sceptical content in their tweets, consistent with previous Twitter analyses (Holmberg and Hellsten 2015) and survey studies conducted in the U.S. (McCright and Dunlap 2011).

Table 1 Multi-level logistic regression models predicting climate scepticism and aggressiveness across U.S. states

We also ran additional multilevel models to test whether per capita CO2 emissions would predict the aggressiveness of tweets, controlling for the same covariates as in our analyses on scepticism. As seen in Model 2 of Table 1, greater levels of between-state per capita carbon emissions were associated with higher rates of aggression when controlling for the same variables as in Model 1. This is consistent with Hypothesis 2, that the tone of debate about climate science would be more aggressive in states that had higher per capita carbon emissions.

3.2 Climate scepticism across nations

We examined whether the findings would replicate using data from the 126 countries represented in the full dataset. A limitation of The Climate Change Twitter Dataset is its reliance on English-language tweets. English-speaking Twitter users in the Global South are likely to be some of the richest, most highly educated, and least representative citizens of these countries. This compounds a perennial problem with examining social media data internationally, which is that in lower-GDP and less-educated countries, internet access is often restricted to people who are relatively high in education and income. This creates a paradox such that individuals responding in low-GDP and less-educated nations might be disproportionately rich and educated. For these reasons, it would be misleading to control for nation-level education or GDP in these analyses.

Despite these limitations, it is worth noting that these international analyses found the same relationship between national per capita carbon emissions and scepticism (Model 1 of Table 2) that emerged in the state-level comparisons within the U.S. When expressing the effect of between-nation per capita CO2 emissions as an odds ratio, our model illustrates that the odds of a person being sceptical grow 1.17 times larger with each SD increase in emissions. In other words, a 1 SD increase in emissions is associated with a 17% increase in the odds of a person living in such a country being sceptical about anthropogenic climate change. Again, this pattern emerged over and above a gender effect that emerged at the individual level (i.e., males are more sceptical). We also tested the association between per capita CO2 emissions and aggressiveness on the international sample (Model 2 of Table 2). Consistent with the findings of the U.S. analyses, per capita CO2 emissions were positively associated with aggressiveness of tweets. In sum, climate-related discourse on Twitter is somewhat more sceptical and aggressive in regions with higher emissions.
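The arithmetic behind the odds-ratio interpretation can be sketched as follows. Only the odds ratio of 1.17 comes from the text; the baseline probability of 0.10 is a hypothetical value chosen for illustration, and the function names are not part of the original analysis.

```python
import math


def percent_change_in_odds(odds_ratio):
    """Percentage change in the odds implied by an odds ratio."""
    return (odds_ratio - 1) * 100


def updated_probability(baseline_p, odds_ratio, n_sd=1):
    """Probability after an n_sd SD increase, given a baseline probability."""
    odds = baseline_p / (1 - baseline_p) * odds_ratio ** n_sd
    return odds / (1 + odds)


reported_or = 1.17                       # odds ratio per SD, as reported in the text
log_odds_coef = math.log(reported_or)    # the matching logistic coefficient
pct = percent_change_in_odds(reported_or)

# Hypothetical baseline: 10% probability that a tweet is sceptical.
p_after = updated_probability(0.10, reported_or)
```

With the hypothetical 10% baseline, a 1 SD increase moves the probability to about 11.5%, which shows that a 17% increase in odds is not the same as a 17% increase in probability.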

Table 2 Multi-level logistic regression models predicting climate scepticism and aggressiveness across 126 nations

4 Discussion

Our findings show that sceptical discourse on Twitter is greater among regions with higher carbon emissions per capita, an effect that emerged among both the states of the U.S. and across nations. The association between carbon emissions per capita and scepticism across states in the U.S. remained significant after controlling for political affiliation, GDP per capita, education, and gender. This pattern is consistent with a vested interests account: as speculated by previous theorists (Hornsey and Fielding 2017; Hornsey and Lewandowsky 2022), we found that expressed scepticism about climate change was more prevalent in regions where the challenges associated with decarbonising the economy are particularly intense.

Interestingly, we also detected that the aggressiveness of social media rhetoric about climate change is greater among regions with greater carbon emissions. This pattern suggests not just variability in climate change views in fossil fuel-reliant regions, but divisiveness. In short, regions with strong economic vested interests in fossil fuels host Twitter debates that are marked by high levels of scepticism and verbal aggression, a pattern that has previously been described as signifying “climate wars” or “culture wars”.

4.1 Limitations and future directions

Like any large-scale analysis of social media data, the current analysis carries limitations. Analysing millions of tweets worldwide offers a broad canvas on which to observe effects but does not offer a great deal of depth or closure around mechanisms underpinning effects. Guiding our theorising is a presumption that publicly enacted rhetoric about climate science is situated within broader campaigns of misinformation, which in turn align with deeper fault lines in society around vested interests. However, we acknowledge that the case is circumstantial, and future research would be necessary to “close the loop” on this argument by tracking misinformation flows.

Future research could also usefully distinguish between different levels of vested interest, both conceptually and empirically. As described earlier, vested interests can exist collectively (e.g., profit motives at the sectoral or industry level; political opportunity at the government level). However, individuals may have their own vested interests in maintaining the existing economic structures, perhaps because their own economic welfare rests on them. It should be emphasised, however, that the two levels of explanation are not mutually exclusive: it is possible that collective vested interests constrain or shape individual beliefs, but that individual vested interests influence people’s preparedness to believe and reward those agents of misinformation (Hornsey and Lewandowsky 2022).

It is important to acknowledge that The Climate Change Twitter Dataset itself carries limitations. Perhaps chief among these is the reliance on automated machine learning classification to infer levels of publicly expressed climate scepticism from social media. This type of measure comes with an unavoidable level of noise which is, of course, offset by sample sizes that dwarf even the largest multinational survey efforts. A further source of noise in the dataset would arise from Twitter bots, which have been estimated to comprise roughly 5% of Twitter users (Duffy and Fung 2022). However, there are no grounds for believing these sources of noise are introducing systematic bias into the analysis or that the reported patterns are artifactual. Rather, it is likely that noise in The Climate Change Twitter Dataset is only serving to make associations between variables harder to detect.

There is good reason to believe that a reliance on English-language tweets has introduced certain biases in the sample, particularly in the international analyses. It is for this reason that our focal analysis compared states within a single nation, allowing us to focus on the variable of interest – per capita carbon emissions – while restricting variation that might occur in the international dataset with respect to culture and language. Focusing on variation within the U.S. also improves confidence that we have access to a representative sample of residents, particularly given that cell phone ownership and social media usage rates are still high among low-income elements of the U.S. population (Mobile fact sheet 2021). Even here, however, the sample is likely to have under-represented Twitter discourse among culturally and linguistically diverse communities within the U.S.

4.2 Conclusions and implications

These limitations notwithstanding, The Climate Change Twitter Dataset is valuable in two ways: (1) it offers the statistical power to assess patterns that had previously been detected in relatively small-sample contexts and (2) it is capable of assessing not just the content of people’s beliefs about climate science but also the tone with which they express those beliefs in public contexts. Datasets such as these help recalibrate the literature from emphasising individual-level explanations to viewing climate beliefs as a publicly enacted set of behaviours shaped by socio-structural forces.

For policymakers, educators and activists, the current data serve as a reminder of the importance of anticipating and defusing the role of vested interests in shaping public discourse about climate change. Given that the problem lies partly in economic realities, structural interventions may be necessary. As has been argued elsewhere (Hornsey and Lewandowsky 2022; Morrison et al. 2022), enduring solutions to the climate change problem may involve strengthening democracies (e.g., by shutting the “revolving door” between lobbyists and government officials; Michaels and Ainger 2019) and increasing transparency in political funding.

Others have focused on structural interventions to reduce the quantity of misinformation online, or at least reduce its impact (Clayton et al. 2020). Educational interventions include debunking of myths, and pre-bunking or “immunisation” strategies, in which communities are educated about the strategies vested interests use to mislead or distort information (Ecker et al. 2022). Finally, the effects on aggression detected in the current paper suggest that successful transition to a decarbonised future may require development – and skilful application – of the fast-moving frontier of research on how best to communicate in polarised contexts (Jost et al. 2022; Moore-Berg et al. 2020). In sum, understanding the macro-level factors that shape orientations toward climate change potentially offers clues about how to target interventions, and does so in a way that acknowledges that the responsibility for change does not reside exclusively among individual community members.