Introduction

In 1987, the World Commission on Environment and Development, commonly known as the Brundtland Commission, published the report Our Common Future. This report included the definition of Sustainable Development as the one “that meets the needs of the present generation without compromising the ability of future generations to meet their needs”, and it was further subdivided into three pillars or dimensions: environment, society, and economy1,2. In the following decades, sustainable development grew in relevance as one of the main organizing principles for meeting human development goals, leading to the adoption of the 2030 Agenda for Sustainable Development by all United Nations Member States in 20153.

Vegetable oils are currently one of the major elements in the global food system, with oil-producing crops occupying 37% of agricultural cropland globally4. Their global demand has increased dramatically since 1960, with a 10-fold increase in soybean and a 25-fold increase in palm oil consumption by 20105. Vegetable oils are an important source of cheap dietary energy and an essential part of healthy diets6. Unsurprisingly, they can be found in more than 50% of packaged products in supermarkets and are also used for animal feed and biofuel production7. As a consequence, they have generated huge economic benefits for their producers8, and their current global market value is estimated at USD 265 billion4. They are especially important for some tropical and subtropical producers. For instance, in 2016, soybean constituted about one-third of the agricultural export earnings of Brazil9, and oil palm accounted for half of Indonesia’s agricultural export earnings.

Increasing oil demand has primarily been met by expanding oil production areas. However, most crop expansion has taken place in a handful of countries. In 2019, Indonesia and Malaysia accounted for 84.4% of the world’s palm oil production, while Argentina and Brazil accounted for 60% of global soybean oil exports10,11. The expansion of cultivated land in these countries is associated with dire environmental impacts, such as deforestation and threats to biodiversity4,12. Yet, even though palm oil accounts for 40% of global vegetable oil demand, it covers less than 5.5% of the total global oil crop area thanks to its large yield per hectare13. Thus, meeting the same demand with other options may be even worse in terms of environmental impact14. As a consequence, there is ample debate among the scientific community on the role of vegetable oils in the context of the Sustainable Development Goals (SDGs)13,15,16,17. For example, Fosch et al.18 suggest that enhancing smallholder plantations can benefit several SDGs, including No Poverty (SDG 1), Good Health and Well-being (SDG 3), Quality Education (SDG 4), Life on Land (SDG 15), and Peace, Justice, and Strong Institutions (SDG 16). Moreover, the consumption of these oils could contribute to achieving Zero Hunger (SDG 2) and Responsible Consumption and Production (SDG 12), while the production of biofuels may support Clean Energy (SDG 7). Furthermore, sustainable management of vegetable oil production is crucial for promoting Decent Work and Economic Growth (SDG 8), addressing biospheric SDGs (6, 13, 14), and advancing equality-related SDGs (5 and 10)4. Finally, the spread of misinformation is connected to SDG 16, as it can influence public narratives and sharpen conflicts. Given the notable alignment between vegetable oils and the SDGs, stakeholders could raise awareness of this relationship on social media platforms, highlighting the pressing challenges addressed by the 2030 Agenda3.
In this paper, we aim to understand how the public perceives this debate on social media and which different SDGs are perceived as more relevant by the public sphere19.

Social platforms have played an important role in shaping public opinion on vegetable oils since their early beginnings. In March 2010, the environmental protection group Greenpeace kicked off a social media campaign against one of the largest food processing companies in the world, Nestlé. The campaign revolved around the use of palm oil in one of their best-known products, claiming that its production led to the destruction of rainforests. In spirit, it was similar to the one held in 2008 against Unilever, which was the target of ‘raids’ by Greenpeace activists dressed as orangutans (Pongo spp.) in several facilities. This time, however, the campaign was initially fully virtual, starting with a video on YouTube and then moving to Nestlé’s Facebook page when the video was taken down. In just a few days, the campaign jumped to Twitter (now X) and the mainstream news headlines. The reputation crisis led Nestlé to suspend purchases from one of its suppliers, join several sustainability initiatives, and promise an analysis of its whole supply chain. This marked a new era for social protests on corporate practices20,21.

Given this context, we focus our analysis on the microblogging platform Twitter, which is commonly used by companies to build reputation, disseminate word-of-mouth campaigns and shape the company’s image22,23. On this platform, users may follow other members without the need to reciprocate. This asymmetry facilitates the creation of opinion relationships and communities24. Besides, its wide audience from various backgrounds and geographical locations is especially interesting in studying global opinion trends25. This characteristic of the platform is unfortunately leveraged by certain groups to spread false news and misinformation26,27.

In spite of the important relationship between public debate and social networks, the literature on vegetable oils and social media is relatively small. Pradipta and Jayadi28 tested two sentiment analysis techniques with a small dataset of 26,469 tweets on palm oil collected between July and September 2021, but did not interpret their output. The media analytics company Commetric published a report on the brands most often referenced in the palm oil debate based on 4,503 tweets collected between September 2018 and September 201929. They found a low overall engagement with palm oil news, with bursts of activity caused by stories about the damage of palm oil production to orangutan habitats. Ruggeri and Samoggia30 analyzed 16,764 tweets collected from 198 palm oil agri-food chain companies. They observed that palm oil producers actively used Twitter to promote palm oil sustainability, while European manufacturers and retailers limited their activity to react to consumers’ questions. Khairiza and Kusumasari discussed the effect of a palm oil social media campaign from the Indonesian government using 9,628 tweets collected from September to November 201931. Finally, van Rossum32 conducted an analysis of Twitter within the framework of Habermas’s concept of the public sphere. Utilizing a case study of 103,500 tweets about palm oil collected from July to September 2018, the study concluded that Twitter does not provide the ideal platform for the formation of public opinion as envisaged in Habermasian terms.

Outside of Twitter, Teng et al.33 explored 4,260 posts from YouTube and Reddit on palm oil. Other authors examine the value social media can provide as a marketing tool34 or for agricultural technology and information expansion35,36 in the context of palm oil. Notably, all these references revolve around palm oil, neglecting the contribution of other vegetable oils to the broader debate on the sustainability of the global food chain. Interestingly, in an investigation by Alvarez-Mon et al. on social attitudes towards the Mediterranean diet, based on 1,608 tweets published by 25 US media outlets between January 2009 and December 2019, olive oil was the element of the diet that generated the least attention37. To conclude this brief overview, it is worth mentioning that in De Lima25 and references therein, the author analyzed the discussion of topics related to sustainability on Twitter, but none of these explicitly mentioned vegetable oils. Our approach sets itself apart by delving into a substantially broader dataset, encompassing a wide array of vegetable oils over a span of 15 years. This broader scope enables us to examine global behavioral trends and long-term impacts, moving beyond the confines of specific events, thereby contributing to a more comprehensive understanding of the discourse surrounding vegetable oils.

In this study, we look into the global discourse on vegetable oils by analyzing over 20 million tweets, revealing key factors shaping public opinion. We discovered that coconut, olive, and palm oils dominate social media discussions disproportionately to their global production. Specifically, discussions about olive and palm oils correlate with Twitter’s growth, while coconut shows more sporadic activity. Conversations around coconut and olive oils focus on health, beauty, and food, whereas palm oil is predominantly linked to environmental concerns. Our findings indicate that viral content often revolves around environmental issues and negative connotations. This research underscores the complex nature of the vegetable oil debate and its divergence from scientific discussions, highlighting the influence of social media on public perception and providing valuable insights for sustainable development strategies.

Results

Vegetable oils presence in Twitter

We begin the characterization of the vegetable oils debate on Twitter by looking at the cumulative number of tweets associated with 10 major oils, Fig. 1a (see Methods). We observe two clear groups of oils: coconut, olive, and palm oil, with 5 million, 6.5 million, and 3.7 million tweets, respectively, and the rest with fewer than 300,000 tweets each (see Supplementary Information, Fig. S1 and Table S1, for more details). Note that this is not proportional to the production volumes of these oils. The world supply of olive and coconut oils is lower than 4 million metric tons. In contrast, the world supply of palm, soybean, and sunflower oils in 2022 was 76, 59, and 20 million metric tons respectively38, with soybean, with 130 million hectares of land, having by far the largest environmental footprint of all oil crops39. Thus, the major oils in terms of social media presence are coconut, olive, and palm, regardless of their actual supply. Henceforth, we focus our analysis on these three oils.

Fig. 1: Anatomy of the vegetable oils presence in Twitter.

Tweets sent from 3/21/2006 to 12/31/2021 containing the bigram “type oil”, where type corresponds to any of the oils listed in the legends. a The cumulative number of tweets for each oil. There are three major oils in terms of social media presence: olive, coconut, and palm oils. b The monthly number of tweets for each of the three major vegetable oils. c Growth relative to 2007, measured as the total number of tweets in a given year over the number of tweets in 2007. We compare the growth of tweets on each vegetable oil with the growth of the 100 most common words in English as a proxy for the growth of Twitter (see Methods). Error bars indicate the standard deviation of the growth for the common words. d The number of geo-tagged tweets per country containing the bigram “palm oil” in 102 languages (see Methods). The total number of tweets collected in any language is 7,946,915, of which only 64,682 are geo-tagged.

In Fig. 1b, we show the monthly number of tweets of the three major oils. Olive oil is characterized by a quick upward trend until 2013, followed by a slow decay, which reverts in 2018. Furthermore, there are few bursts of activity, which are indicators of viral events. Coconut oil has a smaller initial growth, but between 2014 and 2018 it is the oil with the most monthly tweets. The interest in coconut oil decayed from the peak of 2016, with some noticeable bursts until 2022. Lastly, palm oil is characterized by a considerably slower but constant upward trend, with a major viral event in November 2018 that gathered twice as many tweets as the largest burst of either of the other two oils analyzed.

The increase in the number of tweets could mean either that these topics capture more public attention or that it is a direct consequence of the platform’s own growth. To elucidate this, in Fig. 1c, we show the relative growth in the number of tweets for each oil since 2007, compared to the growth of tweets containing any of the 100 most common words in English as a proxy for Twitter’s growth (see Methods). At first glance, the overall trend of tweets containing common words is similar to that of the palm and olive oil tweets. Through the Granger causality test, we confirm that the temporal trend of common tweets significantly predicts the trends in tweets about palm and olive oils (p-value < $10^{-10}$). This predictive association, however, does not extend to coconut oil tweets (p-value = 0.09). These findings suggest that a notable proportion of the rise in tweet volumes can be attributed to the general expansion of Twitter’s user base, rather than specific increases in interest or discussions about these oils. Further details are available in the Supplementary Information, Figs. S2–3 and Tables S2–S4.
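The growth measure used in Fig. 1c (yearly count divided by the 2007 count) can be illustrated with a minimal sketch. The function names and the input layout (dictionaries mapping years to tweet counts) are assumptions made for this example and are not taken from the Methods; the Granger causality test itself is not reproduced here.

```python
from statistics import mean, stdev


def relative_growth(yearly_counts, base_year=2007):
    """Return {year: count / count in base_year} for one keyword."""
    base = yearly_counts[base_year]
    if base == 0:
        raise ValueError("base-year count must be non-zero")
    return {year: n / base for year, n in sorted(yearly_counts.items())}


def baseline_growth(word_counts, base_year=2007):
    """Mean and standard deviation of the growth across proxy words, per year.

    word_counts maps each common word to its own {year: count} dictionary
    (hypothetical layout; at least two words are needed for the deviation).
    """
    per_word = [relative_growth(c, base_year) for c in word_counts.values()]
    years = sorted(per_word[0])
    return {
        y: (mean(g[y] for g in per_word), stdev(g[y] for g in per_word))
        for y in years
    }
```

An oil whose relative growth stays within the baseline mean plus or minus one standard deviation behaves like the average conversation on the platform, whereas a curve well above that band points to increased interest in the topic.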

The growth of olive oil tweets is smaller than the average growth of common tweets, signaling that it is a topic that does not gather special attention. Besides, the decay from 2013 to 2018, followed by an upward trend, is similar to the one seen in common words, reinforcing the idea that its evolution does not differ from the average conversation on the platform. Conversely, the conversations on coconut and palm oils grow much more than common tweets, indicating an increase in public interest. However, coconut oil shows a very fast growth only until 2016, the point at which it starts to decline steadily.

To conclude this initial characterization, we collected all tweets containing the bigram “palm oil” written in 102 languages (see the Supplementary Information, Table S3, for a complete list of the bigrams). There are 7,946,915 tweets in this set, of which roughly 50% are in English. In Fig. 1d, we show the number of tweets that are geo-tagged in each country (see Methods). As expected, given the origin of the aforementioned campaign, the country with the largest number of tweets is the United Kingdom, followed by the United States, Malaysia, Nigeria, and Indonesia. Notably, Nigeria used to be the largest producer of palm oil in the world in the 1960s, and it is currently the largest consumer in Africa40. Thus, the debate is mainly centered in the UK and the US, together with countries that are historical palm oil producers and major consumers. While the geolocalized results provide valuable insights into regional representation, it is important to recognize the limitations resulting from the comparatively small sample of tweets with geographic information. Specifically, less than one percent of tweets within our dataset provide such geographic context. Furthermore, our analysis presents results in absolute terms, without accounting for differences in population size across countries. Finally, geo-tagging a message is optional, and we cannot assume that individuals in all countries are equally likely to do so.

The debate around vegetable oils

Next, we look at the content of tweets to characterize the debate. First, we extract the top 10 hashtags in each set of tweets in English (see Methods). These hashtags can be used as a proxy for the most common topics associated with each oil. In Fig. 2, we show the percentage of tweets that contain each of these hashtags. For coconut oil, we find that most hashtags are related to health or beauty. There is also an important presence of tweets containing giveaway or win, which could be related to marketing campaigns. The topics around olive oil are similar, although there is a larger presence of keywords related to food and nutrition instead of beauty. In contrast, in the case of palm oil, most tweets are related to the negative environmental effects associated with it. Most of them are against palm oil production, except for sustainable and rspo, which stands for Roundtable on Sustainable Palm Oil, an organization with the objective of promoting the growth and use of sustainable palm oil. These palm oil-related hashtags are closely linked to SDGs 12, 13, and 15 (Responsible Consumption and Production, Climate Action, and Life on Land). Additionally, coconut and olive oil hashtags could be linked to SDGs 2 and 3 (Zero Hunger and Good Health and Well-Being).
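As an illustration of the hashtag ranking described above, the following is a minimal sketch. The regular expression, the case-insensitive matching, and the choice to count a hashtag at most once per tweet are assumptions made for this example, not details confirmed by the Methods.

```python
import re
from collections import Counter

# Assumed hashtag pattern: '#' followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def top_hashtags(tweets, k=10):
    """Return [(hashtag, % of tweets containing it)] for the k most used hashtags."""
    if not tweets:
        return []
    counts = Counter()
    for text in tweets:
        # A set ensures each hashtag is counted once per tweet.
        counts.update({tag.lower() for tag in HASHTAG_RE.findall(text)})
    total = len(tweets)
    return [(tag, 100.0 * n / total) for tag, n in counts.most_common(k)]
```

The percentages are computed over all tweets in the set, including those without any hashtag, which matches the way the shares are expressed in Fig. 2.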

Fig. 2: The debate around vegetable oils.

The topics associated with each vegetable oil can vary greatly. a–c Show the top 10 most used hashtags for each set of tweets, expressed as the percentage of tweets containing the hashtag, for coconut, olive, and palm oil, respectively. Palm oil is mainly related to environmental issues, while coconut and olive are predominantly associated with nutrition and health topics. d Reports the results of a Latent Semantic Analysis (LSA) applied to the set of tweets (see Methods). Each point in the plot represents a tweet, with color indicating the vegetable oil it is associated with. The closer two points are in space, the more similar their topics are. Lastly, e depicts a word cloud of the top 2000 hashtags in the palm oil dataset. Font size is proportional to the number of occurrences in the dataset so that the least used hashtags are hardly visible.

Not all tweets contain hashtags. We only find hashtags in 17.7%, 19.5%, and 25.9% of the tweets in the sets of coconut, olive, and palm oil, respectively. To broaden the analysis, we apply Natural Language Processing (NLP) techniques to visualize tweets according to their main topics using the whole text rather than just hashtags. In particular, we apply a Latent Semantic Analysis (LSA), which projects the corpus of tweets into a 2-dimensional space. In this projected space, if two points are close, it implies that their topics are closely related (see Methods). As we can see in Fig. 2d, coconut and olive oil tweets have a wide distribution of topics. In contrast, the diversity of topics in tweets related to palm oil is much smaller, with most of them concentrated around the same axis. This confirms the results of the hashtag analysis. While coconut and olive oil are associated with beauty/food and health, the palm oil debate on Twitter is mostly focused on the negative environmental impacts of its production. This is further validated by Fig. 2e, which shows a word cloud of the top 2000 hashtags for the palm oil dataset. Most hashtags, even beyond the most common ones, are related to environmental issues, such as ‘orangutans’, ‘deforestation’, ‘malaysia’, ‘Indonesia’, and ‘palmoilkills’. Note that, as indicated by the smallest font size in the visualization, the less important (frequent) words are hardly visible.
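The idea behind the 2-dimensional projection can be sketched as follows. This is a generic LSA illustration (TF-IDF weighting followed by a truncated singular value decomposition using NumPy), not the pipeline used in the paper: the tokenizer, the smoothed IDF formula, and the row normalization are all assumptions made for this example.

```python
import math
import re
from collections import Counter

import numpy as np


def lsa_2d(docs):
    """Project documents onto the first two latent dimensions of a TF-IDF matrix."""
    # Assumed tokenizer: lowercase alphabetic tokens only.
    tokenized = [re.findall(r"[a-z]+", doc.lower()) for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    col = {tok: j for j, tok in enumerate(vocab)}
    n_docs = len(docs)
    # Number of documents in which each term appears.
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))

    X = np.zeros((n_docs, len(vocab)))
    for i, toks in enumerate(tokenized):
        for tok, tf in Counter(toks).items():
            idf = math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0
            X[i, col[tok]] = tf * idf

    # L2-normalize rows so that tweet length does not dominate.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X = X / np.where(norms == 0, 1.0, norms)

    # Truncated SVD: keep the two largest singular directions.
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :2] * S[:2]
```

Tweets that share most of their vocabulary end up close together in the projected plane, which is the property used to read Fig. 2d.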

We extend this analysis by looking at the sentiment associated with those topics. We employ a model trained with Twitter data for sentiment analysis. This technique estimates whether the opinion expressed in a text is positive, negative, or neutral (see Methods). In Fig. 3, we report the fraction of tweets of each oil associated with positive, negative, or neutral sentiments. The fraction of neutral tweets is close to 50% in the three cases, a value that is commonly found in sentiment analysis of Twitter independently of the specific topic41,42,43. In contrast, the fraction of tweets associated with negative sentiments in the palm oil dataset (42%) is roughly 4 times larger than in the coconut (12%) or olive oil (10%) datasets. If we look at the evolution of tweets in each of these categories (panels d–f), we observe that their distributions are relatively stable across time (see also Fig. S4 in the Supplementary Information) except for some viral events.

Fig. 3: Sentiment analysis.

a–c Contain the fraction of tweets associated with each sentiment for each of the three datasets under consideration (coconut, olive, and palm). In the three oil cases, roughly 50% of the tweets are considered neutral, but the situation is completely different for negative ones. The datasets of coconut and olive oils contain about 10% of tweets labeled as negative, while this number increases to 42% for palm oil. d Coconut, e olive, and f palm show the monthly number of tweets associated with each sentiment. The share of tweets in each category is fairly constant, except for a few viral events, such as the one related to coconut oil in March 2019 and the one associated with palm oil in November 2018.

In the context of sustainability debates, we concentrate on the case of palm oil in 2018 due to two important events that occurred during this year: the release of a report on palm oil’s impact on biodiversity by the International Union for Conservation of Nature (IUCN)44, and a widespread social media campaign by Iceland Foods and Greenpeace (see Discussion). Figure 4 presents a detailed breakdown of these dynamics. Panel 4a demonstrates that the IUCN report, despite its scientific significance, did not trigger a substantial public response, with tweets mentioning the IUCN peaking at a modest 2,500 in the week of the report’s release before rapidly fading away. Intriguingly, the usage of the term ‘biodiversity’ shows a high correlation with the appearance of ‘IUCN’, suggesting that public discussions around biodiversity were mainly restricted to the context of the IUCN report. This indicates a disconnection between the scientific and public discourses on this matter, as the issue of biodiversity seemed largely absent from the broader public debate.

Fig. 4: The palm oil debate in 2018.

a Displays the weekly count of tweets containing the keywords Biodiversity, Iceland Foods, IUCN, and Orangutan, illustrating the varying public attention to different facets of the issue. This showcases two important but differently echoed events: the IUCN report on palm oil’s impact on biodiversity in late June and the widely noticed Christmas campaign against palm oil by Iceland Foods and Greenpeace. b Visualizes the sentiment (positive, neutral, negative) associated with tweets containing the terms IUCN or Biodiversity, demonstrating the emotional response to scientific discussions. c Presents the sentiment associated with tweets containing the terms Orangutan or Iceland Foods, reflecting the emotional impact of the viral campaign.

In stark contrast, the social media campaign involving Iceland Foods and the symbol of the orangutan spurred a much larger response, as reflected in the volume of related tweets which reached nearly 25,000 during the campaign’s peak. This illustrates the powerful role of emotive campaigns in sparking public interest and engagement. Panels 4b and 4c provide an insight into the emotional tone of these debates. Most tweets related to the IUCN report were neutral, likely reflecting the typically unbiased language used by scientific agencies when disseminating information. However, a considerable portion was also negative, possibly signifying a public pushback against the report’s findings, while positive tweets were conspicuously sparse. In contrast, the sentiment analysis for the social media campaign revealed a different pattern. The majority of tweets were negative, indicating a strong public reaction against the use of palm oil due to its ecological impacts. Positive tweets were also present, while neutral ones were very few, highlighting the strong emotional response triggered by the campaign. This suggests that emotive content can play a critical role in shaping public opinion and engagement on sustainability issues.

Emergence of viral events

Information often spreads through social networks as an avalanche. Users may see some information and desire to share it or comment on it. Then, an avalanche spreads if other users continue to spread that content45. Some topics may be commonly discussed, reaching a large volume of tweets over a long period. Conversely, certain ideas may spread very fast and reach comparable volumes but in a much shorter time. These are known as viral events - events that propagate widely and rapidly46. It is thus essential to study simultaneously both dimensions of the process: (i) the speed of the diffusion and (ii) its reach. To do so, we extract the set of tweets containing each of the 10 most common hashtags concerning each oil. Then, we measure the interevent time (IET), defined as the time interval between two consecutive tweets containing the same hashtag. To study the reach of a hashtag, we look at the daily number of tweets associated with it. We call this quantity the cascade size (CS) (see Methods for further details).
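The two quantities defined above can be computed with a minimal sketch such as the following. It assumes that the tweets carrying a given hashtag are available as Python `datetime` objects and that days without any tweet are left out of the cascade size list; both choices, as well as the function names, are assumptions made for this example.

```python
from collections import Counter
from datetime import datetime


def interevent_times(timestamps):
    """Seconds elapsed between consecutive tweets carrying the same hashtag."""
    ordered = sorted(timestamps)
    return [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]


def cascade_sizes(timestamps):
    """Daily number of tweets for the hashtag, in chronological order.

    Only days with at least one tweet are reported (assumption).
    """
    per_day = Counter(ts.date() for ts in timestamps)
    return [per_day[day] for day in sorted(per_day)]
```

Short interevent times combined with large daily counts correspond to fast and far-reaching diffusion, the signature of a viral event.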

Since we have over 15 years of data, a certain hashtag may participate in multiple viral events. Hence, rather than focusing on a particular period, we study the overall IET and CS distributions for each hashtag under consideration. We fit these distributions using power-law functions, $p(x) \sim x^{-\alpha}$, characterized by the scaling parameter $\alpha$, which are commonly found in many human activities47 (see Methods). In Fig. 5, we represent the scaling parameter of the CS distribution, $\alpha_{CS}$, versus that of the IET distribution, $\alpha_{IET}$. These functions exhibit, in the limit $x \to \infty$, a well-defined mean only when $\alpha > 2$, and a well-defined variance only when $\alpha > 3$. For finite systems, when $\alpha < 3$, the average fails to capture the dynamics accurately due to pronounced fluctuations. These distinct regimes form the foundation for our definition of a virality phase diagram as a function of $\alpha$. Hashtags can then be characterized according to the area of this plane in which they lie.
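To make the role of the scaling parameter concrete, the sketch below estimates it with the standard continuous maximum-likelihood estimator, $\hat{\alpha} = 1 + n / \sum_i \ln(x_i / x_{\min})$, and classifies the result into the moment regimes discussed above. The estimator, the lower cutoff `x_min`, and the function names are assumptions made for this illustration; the fitting procedure actually used is described in the Methods.

```python
import math


def fit_power_law(samples, x_min):
    """Maximum-likelihood estimate of alpha for p(x) ~ x^(-alpha), x >= x_min.

    Returns (alpha, standard error). Continuous-variable estimator; the
    cutoff x_min is supplied by the caller (assumption).
    """
    tail = [x for x in samples if x >= x_min]
    n = len(tail)
    log_sum = sum(math.log(x / x_min) for x in tail)
    if n == 0 or log_sum == 0:
        raise ValueError("need samples strictly above x_min")
    alpha = 1.0 + n / log_sum
    stderr = (alpha - 1.0) / math.sqrt(n)
    return alpha, stderr


def moment_regime(alpha):
    """Classify an exponent by which moments are well defined."""
    if alpha <= 2:
        return "mean and variance diverge"
    if alpha <= 3:
        return "finite mean, divergent variance"
    return "finite mean and variance"
```

An approximate 95% confidence interval follows as the estimate plus or minus 1.96 times the standard error, and applying `moment_regime` to the exponents of both distributions places a hashtag in one of the areas of the phase diagram.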

Fig. 5: Virality phase diagram.

Each point represents one of the 10 most common hashtags in the coconut (purple circles), olive (light green triangles), and palm (dark green squares) datasets. The spatial coordinates are the scaling exponents of the interevent time (x-axis) and the cascade size (y-axis) distributions, together with their 95% confidence intervals. Gray lines delimit the areas defined by the critical exponents (see Methods). Pie charts show the distribution of positive, neutral, and negative tweets with at least one hashtag in each of these areas.

As we can see in Fig. 5, the hashtags associated with each type of oil tend to fall into similar categories. For instance, olive oil tweets are mostly found in regions III and VI, characterized by well-defined volumes and relatively slow dynamics. In contrast, most palm oil tweets are in Region II, which we call the viral region. Hashtags in this region have well-defined CS and IET averages, but their respective variances diverge. This implies that these events do not have easily predictable dynamics, as it is possible to have some cascades with just a few tweets, followed by a period of silence, and then a large burst of tweets in a short period. Interestingly, we observe that the hashtags associated with this region are the ones most related to environmental issues, especially those with negative connotations. The few hashtags in Region V, with well-defined sizes but unpredictable interevent times, are those supporting palm oil, such as “rspo” or “sustainable”.

These observations are further supported by the sentiment distribution of the tweets associated with each region. Regions I and IV contain mostly positive tweets. As we can see, those hashtags, except for “killerpalm”, are related to freebies given during marketing campaigns, which explains that sentiment. Besides, the short duration of these campaigns has an important effect on the IET distribution. The hashtag “killerpalm” shares these characteristics since it was related to a short-lived environmental campaign that took place during the 2015 United Nations Climate Change Conference (COP21)48. The small fraction of negative tweets observed in this area belongs to this hashtag. In Region II, as previously discussed, we find most hashtags related to environmental issues and palm oil, which yields a large number of tweets with negative sentiment. In Region V, positive tweets tend to be associated with coconut oil, while a relatively large fraction of negative tweets are associated with palm oil. Lastly, in regions III and VI, since all of them are related to olive oil, most tweets are neutral, in agreement with Fig. 3b.

Discussion

The role of vegetable oils in the context of the Sustainable Development Goals is far from trivial, as they may have a positive impact on objectives such as no poverty (SDG 1) or zero hunger (SDG 2) but a negative one on climate action (SDG 13) or life on land (SDG 15). This complex relationship has sparked an important debate within the scientific community and, thanks to social media, also within the general public. However, it is crucial to comprehend the factors that influence public opinion and the role of social media in shaping public discourse. Only then would we be in a position to use the collected data for better policymaking. It is important to acknowledge that our study does not provide a comprehensive representation of public sentiment on these topics. Instead, our focus was on examining online interactions within the context of the platform. We must emphasize that our goal was not to establish causal relationships but rather to observe platform-specific online behaviors with the intention of suggesting improved engagement strategies for future discourse.

Guided by a knowledge gap about the relationship between public perceptions of vegetable oils and social media, in this paper we have investigated a corpus of over 20,000,000 tweets on these oils, including a subset of tweets in 102 languages published between March 2006 and December 2021. Our results indicate that discussions about coconut, olive, and palm oils are the most prevalent on Twitter, even though their actual supply is much lower than that of other oils, such as soybean or sunflower. In terms of growth, we observe that the evolution of olive oil tweets and palm oil tweets (at least until 2018) is highly correlated with the overall growth of Twitter, in contrast with coconut oil. In the early 2010s, coconut oil gained a lot of attention. Many labeled it as a “superfood” as many media outlets affirmed that it could help in weight loss, together with several health benefits49,50, which could explain the rapid growth in the number of tweets. But the fad started to decay after reaching its peak in 2016. Besides, the public perception of this oil may have changed after the publication of a review on dietary fats by the American Heart Association in 201751,52, leading to its decline in terms of attention.

Similarly, palm oil shows stable growth, with few bursts of activity for over 10 years. The situation completely changed on November 9, 201853. Previously, in April, the UK supermarket company, Iceland Foods, announced that they would remove palm oil from all their products in an effort to fight against deforestation. The campaign reception was generally positive but relatively small. During summer, Iceland became aware of an animated campaign film made for Greenpeace featuring a baby orangutan that had lost its mother and its forest due to palm oil. They asked to use the film as their Christmas TV advertisement and booked airtime for November. However, one week before its release, the cable company did not approve the advertisement due to political advertisement issues. On November 9, when the advertisement should have been broadcast, the company released the commercial on its website and several social channels. They also explained that it had been banned from TV. This led to an extreme public reaction, with the video becoming viral, showcasing the Streisand effect (when an effort to censor information backfires by increasing awareness of that information)54. This permanently altered the dynamics of the public debate, with the annual growth after 2018 well above that observed before the campaign. Notably, these discussions are mainly centered in the UK and the US, together with palm oil-producing countries.

An analysis of the content of the tweets suggests that discussions about coconut oil tend to focus on health and beauty, and those about olive oil on health and food. Conversely, discussions about palm oil are primarily centered on environmental issues. The sentiments associated with the first two oils tend to be positive, while tweets related to palm oil are mostly negative. These distributions are fairly constant across time, except for some viral events. For instance, in the case of coconut oil, there was a large spike of positive tweets in March 2019. This is due to a joke tweet that went viral, with over 86,000 retweets, which explains why the event is mostly positive. For palm oil, instead, there was a spike in all sentiments in November 2018. These tweets are related to the environmental campaign carried out by Iceland Foods and Greenpeace. Even though the overwhelming majority are labeled as negative, there are also some tweets in the other categories, signaling that the event spread through several conversations and points of view. Furthermore, the coconut oil viral event did not have long-lasting impacts, while the palm oil event altered public perception up to the last day of data (see Fig. 1c). The lack of social media impact from a key publication by the IUCN, the largest conservation organization in the world, is food for thought for scientists seeking to use science and social media to influence public perceptions on controversial and polarized topics. The powerful amplification potential of social media may not apply to nuanced scientific information, possibly because herd behavior on social media55 is driven by emotive reactions to positive or negative messages rather than neutral ones. It is difficult to get emotional about a finding that oil palm can be good or bad depending on the context4.

A comparison of scientific discourse with online discussions reveals striking differences. While social media virality exhibits intense peaks of interest, it is also characterized by fleeting attention spans. The ephemeral nature of these events poses a challenge to the development of effective policies aimed at moderating online behavior and countering the spread of misinformation. Meeting this challenge requires strategies that can respond to events in a timely manner. However, it is important to recognize that discussions about relevant issues, such as the sustainability of vegetable oils, should not be driven solely by these viral events. Instead, we advocate for a more transparent and accessible approach to engagement within the scientific community, which could foster greater public trust. In addition, leveraging insights from psychology, particularly behavioral nudging techniques, is essential to effectively translate nuanced scientific findings into changes in public perception and consumer behavior. In particular, there is a discernible preference among conscientious consumers for simplistic narratives, such as categorizing oil crops as either “good” or “bad,” to satisfy their desire for ethical decision-making. However, such binary views fail to capture the complex sustainability dynamics underlying these decisions, highlighting the importance of bridging the gap between public perceptions and the on-the-ground realities of social and environmental impacts.

We conclude the analysis by focusing on the diverse characteristics of viral events concerning each oil. Topic popularity and interevent time distributions tend to be very fat-tailed45,56,57, and this pattern is also present in the debate on vegetable oils. We observe events reaching a large number of people in a short period and others having a slower, more sustained spread. In particular, hashtags associated with olive oil generally have a slower spread and more predictable dynamics. In contrast, those associated with palm oil are more likely to be viral, with a larger variance in both the size and timing of their spread. We also verified a strong relationship between sentiment and virality. Hashtags related to environmental issues and negative connotations were more likely to be viral, and hashtags associated with marketing campaigns were more likely to have a faster spread but shorter duration. This has important implications since it has been observed that political messages containing moral-emotional language tend to spread more within ideological groups and less between them58. Future developments of this work could include the replication of this analysis on different social media platforms, as it could be useful to compare the user bases of such platforms.

Conclusions

In summary, these findings suggest that scientific discourse sharply contrasts with online discussions on Twitter, with different oils being associated with different topics and sentiments. Most oils go unnoticed in the overall discussion, while others are only associated with health or nutritional aspects. Only palm oil is discussed in terms of sustainability, and the conversations tend to be mostly negative, even though the cultivation of other oils may also have important environmental implications59. Indeed, it is crucial to note that the intensification of olive groves has posed a threat to biodiversity in Spain, resulting in the destruction of wintering bird communities’ refuges60. Furthermore, research has indicated that coconut oil threatens a higher number of species than other oils, followed by olive and palm oils61. In addition to the lack of topic variety, palm oil comes across as substantially more susceptible to viral events than other vegetable oils. This is another symptom of the lack of uniformity in the discussion around seemingly similar topics. Furthermore, the discussion around the sustainability of palm oil is an interesting example of how an awareness-raising campaign can bring the general public’s attention to hitherto little-addressed issues and how the environmental issue is becoming preponderant in public discussion. As such, future awareness campaigns may want to highlight some of the strengths and weaknesses of this and other oils to provide consumers with adequate tools to address issues as complex as the SDGs. Practitioners could benefit from the insights provided by this work for calibrating future marketing campaigns. To attain sufficient visibility for crucial topics such as the Sustainable Development Goals, it is essential to disrupt the prevailing statistical patterns that typify viral events. By mitigating the impact of short-lived, bursty events on social media discourse, we can enhance the longevity and engagement of public opinion in sustainability-related discussions.

Methods

Basics of Twitter

Twitter (now X) is a microblogging and social network platform in which users can post and interact with short messages known as tweets. Tweets were originally limited to 140 characters, but in November 2017, the limit was doubled to 280 characters. Pieces of text that start with # are called hashtags, and usernames preceded by @ are known as mentions. These allow users to group their messages into topics and to address other users directly. Users may also follow the activity of other users, although this relationship does not have to be reciprocal. Lastly, users may repost messages from other users and share them with their followers. Such messages are known as retweets62. It is important to note that, since the data collection for this work, the platform has been rebranded as X, and its terminology has changed accordingly (for example, a “retweet” is now a “repost”).

Data collection

We used Tweepy v4.463 to connect to the Twitter API v2 with Academic Research access. We downloaded all tweets related to a selection of vegetable oils published between March 21, 2006 (the beginning of Twitter) and December 31, 2021. The Academic license allows access to the whole database of public tweets, with a limit of 10 million tweets per month. As queries, we used the bigrams “type oil”, where type is either canola, coconut, corn, cottonseed, olive, palm, peanut, rapeseed, soybean, or sunflower. Note that queries are case-insensitive. This yielded a corpus of over 15 million tweets, which was later complemented with tweets containing the term “palm oil” expressed in 102 languages (see Table S3 in the Supplementary Information for the exact terms), expanding the corpus to over 20 million tweets. Furthermore, to estimate Twitter’s growth, we also obtained the number of tweets per day containing any of the 100 most common English words according to the Oxford English Corpus (see Table S4 in the Supplementary Information for the list of words).
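As an illustration, the query bigrams described above could be assembled as follows. This is a minimal sketch: wrapping each bigram in double quotes so that it is matched as an exact phrase is an assumption, and the names `OIL_TYPES` and `build_queries` are hypothetical rather than taken from the original pipeline.

```python
# Oil types listed in the text.
OIL_TYPES = [
    "canola", "coconut", "corn", "cottonseed", "olive",
    "palm", "peanut", "rapeseed", "soybean", "sunflower",
]


def build_queries(oil_types):
    """Return one query string per oil type, e.g. '"palm oil"'.

    Assumption: each bigram is quoted so it is searched as an exact phrase.
    """
    return [f'"{oil} oil"' for oil in oil_types]


queries = build_queries(OIL_TYPES)
for query in queries:
    print(query)
```

Each string could then be passed to a full-archive search call (for instance Tweepy's `Client.search_all_tweets`), together with the start and end dates given above; the exact call used in the study is not stated.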

Twitter analysis

Twitter growth

For each oil, we compute yearly growth as the annual number of tweets divided by the number of tweets in a reference year. In 2006, several oils were not discussed on the platform. As such, we take 2007 as the reference year. In the Supplementary Information, we show the results for other choices, which yield the same qualitative behavior (Fig. S3).
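The growth measure can be sketched in a few lines. The function name `yearly_growth` and the counts below are made up for illustration only and are not data from the study.

```python
def yearly_growth(annual_counts, reference_year=2007):
    """Divide each year's tweet count by the count in the reference year."""
    reference = annual_counts[reference_year]
    if reference == 0:
        raise ValueError("the reference year has no tweets")
    return {year: count / reference
            for year, count in sorted(annual_counts.items())}


# Hypothetical annual tweet counts for one oil (illustrative values).
toy_counts = {2007: 200, 2008: 500, 2009: 1400}
growth = yearly_growth(toy_counts)
print(growth)
```

By construction the reference year has growth 1, so curves for different oils start from a common baseline.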

Geolocation

Twitter allows users to include their location in their tweets, which used to be either a place or exact coordinates. Since 2019, exact coordinates are no longer available. Besides, this option is disabled by default. As a result, it is estimated that less than 1% of tweets are geo-tagged64. In our case, we found 64,682 geo-tagged tweets in the set of 7,946,915 tweets containing “palm oil” in any of the languages under consideration. We extracted the country of the place associated with each tweet directly from its metadata to estimate where the tweets were produced65.
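The sketch below computes the share of geo-tagged tweets from the two figures reported above and shows one way of tallying tweets per country. The record layout, in particular the flattened `country_code` key, is an assumption made for illustration and does not reproduce the actual API response structure.

```python
from collections import Counter

GEOTAGGED = 64_682    # geo-tagged tweets reported in the text
TOTAL = 7_946_915     # tweets containing "palm oil" in any language

share = GEOTAGGED / TOTAL
print(f"geo-tagged share: {share:.2%}")


def count_countries(tweets):
    """Count tweets per country, skipping tweets without place metadata."""
    return Counter(t["country_code"] for t in tweets if t.get("country_code"))


# Hypothetical records; "country_code" is an assumed, flattened field.
toy_tweets = [
    {"id": 1, "country_code": "GB"},
    {"id": 2, "country_code": None},
    {"id": 3, "country_code": "ID"},
    {"id": 4, "country_code": "GB"},
]
per_country = count_countries(toy_tweets)
print(per_country.most_common())
```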

Hashtags

Hashtags can be used to identify the most common keywords or topics related to a set of tweets. To obtain the most relevant hashtags, first, we remove the trivial hashtags in each set: #oliveoil, #olive and #oil; #coconutoil, #coconut, and #oil; and #palmoil, #palm, and #oil. Then, we extract the 10 most common hashtags for each set of tweets.
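The two steps above (dropping trivial hashtags, then ranking the rest) can be sketched as follows. Lowercasing hashtags before counting is an assumption, and the names `TRIVIAL` and `top_hashtags` as well as the toy input are hypothetical.

```python
from collections import Counter

# Trivial hashtags removed for each set of tweets, as listed in the text.
TRIVIAL = {
    "olive": {"#oliveoil", "#olive", "#oil"},
    "coconut": {"#coconutoil", "#coconut", "#oil"},
    "palm": {"#palmoil", "#palm", "#oil"},
}


def top_hashtags(hashtag_lists, oil, n=10):
    """Return the n most common non-trivial hashtags for one oil's tweets."""
    counts = Counter(
        tag.lower()                      # assumption: case-insensitive counting
        for tags in hashtag_lists
        for tag in tags
        if tag.lower() not in TRIVIAL[oil]
    )
    return counts.most_common(n)


# Hypothetical hashtags extracted from two tweets.
toy = [["#PalmOil", "#deforestation"],
       ["#orangutan", "#deforestation", "#oil"]]
top = top_hashtags(toy, "palm")
print(top)
```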

Latent semantic analysis

Latent semantic analysis (LSA) is a natural language processing (NLP) technique based on the assumption that words that are close in meaning will occur in similar pieces of text. This allows the visualization of documents according to topics based on their content in an unsupervised way. That is, topics are not defined a priori. We performed this analysis using scikit-learn v1.0.266. More specifically, we selected the corpus of 15,359,185 tweets containing olive, coconut, and palm oil. Then, we obtained the 3000 most representative words in this set and created a 15,359,185 × 3000 matrix. Each entry of the matrix contains the so-called term frequency-inverse document frequency (tf-idf), which weights words according to how common they are in a document but penalizes words that are very common across the whole set of documents. Lastly, we applied the LSA technique to reduce the number of dimensions to 2, which is the preferred choice for visualization purposes. In general, it is not possible to associate a particular subject or discourse with each of these dimensions. See the Supplementary Information for further details.
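A minimal sketch of this kind of pipeline with scikit-learn is shown below, using a handful of invented documents in place of tweets. The specific choices are assumptions: `max_features=3000` keeps the most frequent terms, which is only one possible reading of “most representative words”, and the preprocessing and parameters actually used are described in the Supplementary Information, not here.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy documents standing in for tweets (not data from the study).
docs = [
    "olive oil for a healthy salad",
    "coconut oil for skin and hair",
    "palm oil drives deforestation",
    "olive oil pasta recipe",
    "coconut oil hair mask",
    "palm oil threatens orangutan habitat",
]

# Build the documents x terms tf-idf matrix, keeping at most 3000 terms.
vectorizer = TfidfVectorizer(max_features=3000)
tfidf = vectorizer.fit_transform(docs)

# A truncated SVD of the tf-idf matrix gives the LSA projection;
# two components are kept so that each document becomes a 2-D point.
lsa = TruncatedSVD(n_components=2, random_state=0)
coords = lsa.fit_transform(tfidf)

for doc, (x, y) in zip(docs, coords):
    print(f"{x:+.3f} {y:+.3f}  {doc}")
```

Each row of `coords` can be plotted as a point, so that documents with similar vocabulary end up close together.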

Sentiment analysis

Sentiment Analysis is a supervised machine-learning technique that relies on a pre-trained model to determine if the expressed opinion in a document is positive, negative, or neutral. In particular, we employed the Twitter-xlm-roberta-base-sentiment model. This model was pre-trained on a corpus of almost 200 million tweets in 30 languages and fine-tuned for sentiment analysis67. For each tweet, the model assigns a score to the labels negative, neutral, and positive. For simplicity, we chose the label with the highest score to classify its sentiment. See the Supplementary Information for further details.
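The final labeling step, choosing the label with the highest score, can be sketched as follows. The scores are invented for illustration; running the model itself is outside the scope of this example.

```python
def classify_sentiment(scores):
    """Return the label with the highest score."""
    return max(scores, key=scores.get)


# Hypothetical model output for a single tweet (illustrative values).
example_scores = {"negative": 0.71, "neutral": 0.22, "positive": 0.07}
label = classify_sentiment(example_scores)
print(label)
```

Taking the top label discards the score margins, so a tweet scored 0.40/0.35/0.25 is treated the same as one scored 0.95/0.03/0.02.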

Virality characterization

To characterize virality, we focused on two important characteristics of information cascades or avalanches. The first one is the interevent time, which measures the period between two tweets containing the same hashtag. The second one is the cascade size of a certain topic, which measures its popularity.

Interevent time (IET)

The IET distribution associated with a certain hashtag is the distribution of the time gaps between consecutive tweets containing that hashtag. We fit these data to a power law with probability density function:

$$p(\tau )\propto {\tau }^{-\alpha },$$
(1)

where α is known as the scaling parameter or scaling exponent, which determines the characteristics of these heavy-tail distributions47.
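As a sketch of how the interevent times of a single hashtag might be extracted, the function below sorts the tweet timestamps and takes the gaps between consecutive ones. Measuring the gaps in seconds is an assumption, and the timestamps are invented.

```python
from datetime import datetime


def interevent_times(timestamps):
    """Return the gaps, in seconds, between consecutive tweets."""
    ordered = sorted(timestamps)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical posting times of tweets sharing one hashtag.
toy_times = [
    datetime(2018, 11, 9, 12, 0, 0),
    datetime(2018, 11, 9, 12, 0, 30),
    datetime(2018, 11, 9, 12, 5, 30),
    datetime(2018, 11, 9, 14, 5, 30),
]
gaps = interevent_times(toy_times)
print(gaps)
```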

Cascade size (CS)

In our context, an information cascade can be broadly defined as the series of tweets with a given hashtag posted after an initial tweet containing it. We define the CS distribution of a certain hashtag as the distribution of the number of tweets posted every day using that hashtag. We fit these distributions using the same heavy-tailed distribution defined in Eq. (1).
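The daily cascade sizes of a hashtag could be obtained by counting tweets per calendar day, as in the sketch below. Omitting days without any tweet is an assumption of this example, and the timestamps are invented.

```python
from collections import Counter
from datetime import datetime


def daily_cascade_sizes(timestamps):
    """Return the number of tweets per calendar day, in chronological order.

    Assumption: days without tweets are omitted rather than counted as zero.
    """
    per_day = Counter(ts.date() for ts in timestamps)
    return [per_day[day] for day in sorted(per_day)]


# Hypothetical posting times of tweets sharing one hashtag.
toy_times = [
    datetime(2018, 11, 9, 12, 0),
    datetime(2018, 11, 9, 18, 30),
    datetime(2018, 11, 10, 9, 15),
    datetime(2018, 11, 12, 8, 0),
]
sizes = daily_cascade_sizes(toy_times)
print(sizes)
```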

We fit the IET and CS distributions using the Powerlaw v1.5 package68 (see the Supplementary Information for further details, Figs. S5-S10 and Tables S5-S10).
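For intuition about what such a fit estimates, the sketch below implements the standard maximum-likelihood estimator of the scaling exponent for a continuous power law with a known lower cutoff, α = 1 + n / Σ ln(x_i / x_min). It is a simplified stand-in, not the procedure of the package used in this work, which performs additional steps such as selecting the lower cutoff. The function name `estimate_alpha` and the sample values are hypothetical.

```python
import math


def estimate_alpha(values, x_min):
    """Continuous maximum-likelihood estimate of alpha for values >= x_min."""
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("need at least one value strictly above x_min")
    return 1.0 + len(tail) / log_sum


# Invented sample; x_min is fixed by hand here rather than estimated.
sample = [1.0, 2.0, 4.0, 8.0]
alpha = estimate_alpha(sample, x_min=1.0)
print(round(alpha, 3))
```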

Virality regions

A power law has a well-defined mean only if α > 2, and it has a well-defined variance only if α > 369. This distinctive characteristic of these distributions allows us to group hashtags according to the specific values of their scaling exponents for the CS (αCS) and IET (αIET) distributions. In Fig. 5, we observe that our hashtags may fall into 6 different regions:

  • Region I (2 ≤ αCS < 3 & αIET < 2): virally big, unpredictably fast regime. The average CS is finite, while the average and higher moments of the IET distribution diverge in the infinite size limit. A large variance of the CS signals that, even though the average is well-defined, there might be exceptionally large viral events. The large average and variance of the IET distribution indicate that these events may be very fast but rare.

  • Region II (2 ≤ αCS < 3 & 2 ≤ αIET < 3): viral regime. This is the typical signature of viral events. Both averages are finite, but the divergence of the variance (in the infinite size limit) indicates that they may be bursty and large.

  • Region III (2 ≤ αCS < 3 & αIET ≥ 3): virally big, unvirally slow regime. The variance of the CS is large, indicating that there might be exceptionally large events, but they occur at predictable rates.

  • Region IV (αCS ≥ 3 & αIET < 2): unvirally small, unpredictably fast regime. In contrast with Region I, events in this category have a well-defined size and do not enter into the viral regime.

  • Region V (αCS ≥ 3 & 2 ≤ αIET < 3): unvirally small, virally fast regime. Events in this regime have well-defined cascade sizes, but they are not regularly active and a large amount of time may elapse between successive cascades of the same hashtag.

  • Region VI (αCS ≥ 3 & αIET ≥ 3): unviral regime. Both the average and variance of the distributions are finite, and thus, hashtags that lie in this regime can be labeled as non-viral.
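The six regions above amount to a lookup on the two scaling exponents, which can be sketched as follows. The function name `virality_region` is hypothetical, and returning `None` for αCS < 2 is a choice made for this example, since that case is not covered by the regions listed.

```python
def virality_region(alpha_cs, alpha_iet):
    """Return the region label (I to VI), or None if alpha_cs < 2."""
    if alpha_cs < 2:
        return None  # not covered by the six regions listed above

    # Column: position of the IET exponent relative to 2 and 3.
    if alpha_iet < 2:
        column = 0
    elif alpha_iet < 3:
        column = 1
    else:
        column = 2

    # Row: cascade-size variance diverges (2 <= alpha_cs < 3) or is finite.
    row = 0 if alpha_cs < 3 else 1

    labels = [["I", "II", "III"],
              ["IV", "V", "VI"]]
    return labels[row][column]


# Illustrative exponent pairs (not fitted values from the study).
for pair in [(2.5, 1.5), (2.5, 2.5), (3.5, 3.2)]:
    print(pair, virality_region(*pair))
```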