1 Introduction

Tracking progress is of key importance for climate action: it builds necessary trust between stakeholders, is a prerequisite for accountability, and allows the climate change community to learn lessons from existing efforts and to identify where additional action is needed (Olhoff et al. 2018; Weikmans et al. 2021; Gupta and van Asselt 2019; Milkoreit and Haapala 2019). Recognising this importance, the Paris Agreement established the Global Stocktake (GST), which has a technical and a political component. On the technical side, the GST is a review of different sources of information to assess current progress towards climate mitigation and adaptation ambitions; politically, the GST fits into the Paris Agreement’s broader “pledge-and-review” architecture, so countries should in theory increase their ambitions based on the outcomes of the GST (Craft and Fisher 2018; Roelfsema et al. 2020; Milkoreit and Haapala 2019). The first stocktake concludes at the UNFCCC COP28 in Dubai in 2023.

With over 170 thousand pages of documents to be considered (UNFCCC 2023), the scale of the Stocktake is daunting. Submissions are also highly varied, as they are made by a wide variety of Party and non-Party actors (“Party” here refers to countries which have signed the UNFCCC), and they can be in any of the United Nations (UN) languages. Consequently, on the political side, negotiators are meant to make decisions on—and draw lessons from—a body of work they cannot be expected to read fully. On the technical side, the co-facilitators of the GST and the UNFCCC secretariat were left with the unenviable task of producing a synthesis report of all inputs to the GST, which has just been published (UNFCCC 2023).

In the report, they highlight 17 key findings, grouped into 4 major themes, as specified by Article 14 of the Paris Agreement: mitigation, adaptation including Loss and Damage, means of implementation including finance, preceded by a more general context section. Annex I of the report describes in one paragraph per theme what data sources were considered.

Clearly, providing such a synthesis in a transparent manner is a major challenge. Yet being clear about what is included and why, in our view, is crucially important, especially given the political nature of both the stocktaking process and of the submissions which form the basis of the GST (Gupta and van Asselt 2019; Weikmans et al. 2021; Christiansen et al. 2020).

To make matters even more complex, the GST is meant to take place every 5 years and should therefore be repeatable. In practice, given the scale, the lack of an agreed-upon methodology, and the likelihood of continued political pressures, it remains unclear how future stocktakes can be conducted in a comparable manner. This would make it more difficult to assess progress on climate action over time.

In this Letter, we argue that data science methods, in particular the use of Natural Language Processing (NLP), can help resolve these issues: such methods have proven to be a relatively transparent way to handle large volumes of varied information, while allowing also for easy replication and updating (Falkenberg et al. 2022; Schaefer et al. 2023; Lesnikowski et al. 2019). We focus primarily on the technical component of the GST but will highlight the political relevance of the results where appropriate. Note, however, that the political negotiations are still ongoing at the time of writing.

2 A case study for multilingual topic modelling

NLP refers to the use of data science methods, in particular machine learning and artificial intelligence (AI), to extract and analyse information from text-based data. Here we use topic modelling, which allows us to identify clusters of documents with similar content (i.e. it assumes that documents which use similar words are discussing a similar topic) and to describe these clusters with distinctive keywords (i.e. words that occur relatively often in a given cluster are a good proxy for the topic content). For further explanations, see Lesnikowski et al. (2019) or Asmussen and Møller (2019). The relevance for the GST is straightforward: it allows the user to trace exactly which documents contribute to any given topic.
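To make the idea of "distinctive keywords" concrete, the sketch below ranks words per cluster with a simplified class-based TF-IDF, in the spirit of the weighting described in the BERTopic documentation. It is an illustration under our own assumptions, not the library's implementation: the function name `distinctive_keywords`, the pre-tokenised input and the toy clusters are all hypothetical.

```python
import math
from collections import Counter


def distinctive_keywords(clusters, top_n=3):
    """Rank words per cluster with a simplified class-based TF-IDF.

    clusters: dict mapping a cluster label to a list of tokenised documents.
    Returns a dict mapping each label to its top_n highest-scoring words.
    """
    # Pool all documents of a cluster and count its words.
    counts = {
        label: Counter(word for doc in docs for word in doc)
        for label, docs in clusters.items()
    }
    # Frequency of each word across all clusters.
    total = Counter()
    for cluster_counts in counts.values():
        total.update(cluster_counts)
    # Average number of words per cluster.
    avg_words = sum(total.values()) / len(counts)

    keywords = {}
    for label, cluster_counts in counts.items():
        n_words = sum(cluster_counts.values())
        # Words frequent in this cluster but rare overall score highest.
        scores = {
            word: (tf / n_words) * math.log(1 + avg_words / total[word])
            for word, tf in cluster_counts.items()
        }
        # Sort by descending score; ties are broken alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        keywords[label] = ranked[:top_n]
    return keywords
```

A word such as "security" that appears in several clusters is down-weighted relative to a word that is concentrated in one cluster, which is why the resulting keywords can serve as a proxy for topic content.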

The topic modelling method used here is BERTopic (Grootendorst 2022). This has two main advantages: first, this method uses state-of-the-art transformer-based models. Such models create so-called “embeddings”, which are mathematical representations of the meanings of words; because the models are trained on large sets of documents, these embeddings can encode relatively rich and context-specific meanings of words. Second, and crucially for our case, embeddings can be multilingual, meaning that the same input sentence in different languages results in near-identical embeddings. In particular, we use a multilingual version of Sentence-BERT (Reimers and Gurevych 2020), which supports all of the United Nations Languages.

To create our dataset, we take the Global Stocktake Dataset from Climate Policy Radar (Climate Policy Radar 2023), as this contains all inputs to the GST with relevant meta-data. We download the documents and extract text with Google Tesseract, filtering out title pages, tables of contents and tables as much as possible. Long paragraphs and texts where paragraphs could not be detected are segmented into blocks of around 100 words. In doing so, each paragraph will likely discuss only one or a few topics.
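The segmentation step can be sketched as follows. The text above only states that long passages are cut into blocks of around 100 words, so the even-split rule used here, the function name `split_into_blocks` and whitespace tokenisation are assumptions made for illustration.

```python
def split_into_blocks(text, target_words=100):
    """Split text into consecutive blocks of roughly target_words words."""
    words = text.split()
    if not words:
        return []
    # Decide on the number of blocks first, then spread the words evenly,
    # so that the final block is not a short leftover.
    n_blocks = max(1, round(len(words) / target_words))
    base, extra = divmod(len(words), n_blocks)
    blocks, start = [], 0
    for i in range(n_blocks):
        # The first `extra` blocks receive one additional word.
        size = base + (1 if i < extra else 0)
        blocks.append(" ".join(words[start:start + size]))
        start += size
    return blocks
```

Splitting evenly rather than cutting every 100 words keeps block lengths comparable, which matters when each block is later embedded and clustered as one unit.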

Table 1 shows that the great majority of the paragraphs (90.0%) are submissions from countries, often in the form of national reports. At various stages in the GST process, submissions on a particular issue were requested, including by various non-governmental organisations, United Nations bodies and the most recent IPCC reports. Lastly, some documents were submitted by the UNFCCC secretariat; the largest group within these texts (n = 893, 45.1%) are synthesis reports, often created at the request of Parties on a specific subject. We remove documents with a procedural focus—primarily technical assessments of country reports by the UNFCCC secretariat (n = 97), of which the limited non-procedural content is represented in the data already by the country reports themselves.

Table 1 Number of paragraphs in the final dataset, subdivided by language, author type and World Bank region. Categories are mutually exclusive. For the region, only country submissions are considered. Joint submissions from within a region (e.g. European Union) are also counted, but they are left out if the submission spanned multiple regions (e.g. Group of 77). The most prominent types of submissions are also given for the different author types, grouping together the major reporting requirements established pre-Paris Agreement (NC, National communications; BUR, Biennial Update Report; BR, Biennial Report; note these cannot be separated as they are sometimes submitted jointly). NDC, Nationally Determined Contribution under the Paris Agreement; IPCC, Intergovernmental Panel on Climate Change. Party here stands for Party to the Convention, meaning countries

In line with best practice (Chang et al. 2009; Müller-Hansen et al. 2020), we trial models with between 30 and 100 topics to see what number of topics provides sufficient detail for our analysis without duplicated or “junk” topics. Based on expert judgement, we find the optimal number of topics is likely in the range of 60–80 topics, for which we trial additional topic models with a variety of hyper-parameters, before settling on a final model with 72 topics. We then provide a descriptive name for each topic based on the 10 most frequent terms for this topic and their most closely associated paragraphs. Seventeen of these topics are so-called guided topics, meaning we prompted the model to find clusters of documents around a priori specified keywords. Our guided topics are based on the 17 key findings of the GST synthesis report, which are listed in the report’s summary for policymakers (see Supplementary Materials for more details on our methods).
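A guided, multilingual model of this kind could be configured roughly as below. This is a sketch rather than our exact configuration: the checkpoint name is one publicly available multilingual Sentence-BERT model chosen as an assumption, the seed keywords are hypothetical placeholders (the actual keywords are in the Supplementary Materials), and `build_guided_model` is an illustrative helper. The `embedding_model`, `seed_topic_list` and `nr_topics` arguments are parameters of the BERTopic constructor.

```python
def build_guided_model(seed_topic_list, nr_topics=None):
    """Configure BERTopic with a multilingual embedding model and guided topics."""
    # Imported lazily so that this sketch can be loaded without the
    # third-party packages being installed.
    from bertopic import BERTopic
    from sentence_transformers import SentenceTransformer

    # Assumption: a public multilingual Sentence-BERT checkpoint.
    embedding_model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
    return BERTopic(
        embedding_model=embedding_model,
        seed_topic_list=seed_topic_list,  # one keyword list per guided topic
        nr_topics=nr_topics,              # optional reduction to a fixed number
    )


# Hypothetical seed keywords, one list per guided topic.
SEED_TOPICS = [
    ["just transition", "youth", "workers"],
    ["finance", "investment", "funding"],
]

# Usage (requires the packages and a model download):
# topic_model = build_guided_model(SEED_TOPICS, nr_topics=72)
# topics, probabilities = topic_model.fit_transform(paragraphs)
```

Seed keywords only nudge the clustering towards the specified themes; a guided topic can still dissolve or drift if few paragraphs resemble its keywords.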

We use the resulting topic model in two ways: first, we determine which topics are important for different stakeholder groups to identify political priorities. Second, we compare the results to the official technical Synthesis Report (UNFCCC 2023) by identifying which topics closely match the key findings of the report and how the most-closely related paragraphs in the model discuss the same topic. To find closely matching paragraphs, we remove the process-focussed paragraphs of the report by hand and test similarity of paragraphs in the submission texts based on cosine similarity of embeddings (Reimers and Gurevych 2019). In essence, this functions as a proxy for which (types of) submissions ended up influencing the report the most. Note, however, that the Synthesis Report is not only based on text: as part of the GST, discussions and workshops were held which are not present in our input data.
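The matching step can be illustrated with a small, dependency-free sketch. It assumes embeddings are already available as lists of floats; the defaults of 20 matches and a minimum similarity of 0.7 mirror the values given in the caption of Fig. 3, while applying them per report paragraph is our reading for the purpose of this example. The function names are illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat zero vectors as matching nothing
    return dot / (norm_u * norm_v)


def closest_matches(report_vec, submission_vecs, top_k=20, min_similarity=0.7):
    """Return (index, similarity) pairs for the submission paragraphs
    most similar to one report paragraph, best match first."""
    scored = [
        (i, cosine_similarity(report_vec, vec))
        for i, vec in enumerate(submission_vecs)
    ]
    # Drop weak matches, then keep the top_k strongest.
    kept = [(i, s) for i, s in scored if s >= min_similarity]
    kept.sort(key=lambda pair: -pair[1])
    return kept[:top_k]
```

Because cosine similarity ignores vector length, a long and a short paragraph can still match closely if their embeddings point in the same direction.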

3 Results

The results show a wide mixture of topics, reflecting the variety of GST inputs (see Table 2 and Supplementary Materials for associated keywords). Topics centred around geographic place names are not considered in the below.

Table 2 Named topics in our topic model grouped based on major themes in the synthesis report. Topics where the most-associated words match keywords supplied for the guided topics have been marked in red. Please see the supplementary materials for the most-associated words and paragraphs per topic. Acronyms: EU, European Union; IPCC, Intergovernmental Panel on Climate Change; NAMA, Nationally Appropriate Mitigation Actions; UNFCCC, United Nations Framework Convention on Climate Change

3.1 Different languages and regions have substantially different priorities

The GST co-facilitators aimed for inclusivity, inviting submissions and other inputs from a great variety of stakeholders and in any of the UN languages. Here, we use our topic model results to investigate if we can trace this diversity. We find multiple non-English topics, including topics where the most-associated keywords are a mixture of languages. The Food security topic is a good example: it includes “securité alimentaire”, “food security” and “seguridad alimentaria”, which is the same term in French, English and Spanish respectively. Overall, this is an encouraging sign for the quality of our model, indicating that the embedding model broadly succeeds in creating embeddings that are multilingual.

Figure 1a shows the influence of different languages on the topics. One of the more striking differences is the focus of English language topics on mitigation and plans more broadly, with Net zero finance in particular being almost exclusive to English language documents. This appears to be driven by non-Party submissions from both the Secretariat and external stakeholders. The net zero topic is an outlier in this regard, as documents from the Secretariat generally emphasise procedural issues, such as the Kyoto Protocol, and relatively specific terminology, such as that around the various Adaptation plans (see the Supplementary Materials Figure SM2-3).

Fig. 1

The most overrepresented topics in the topic model. In a the results are plotted on a logarithmic scale such that a score of 2 means that the topic occurs twice as often in documents written in the given language, relative to all other documents. Arabic language documents are not considered due to insufficient data. b gives the topic labels for the most overrepresented topics for each World Bank region and shows the number of paragraphs from each country, with each region plotted separately. For both plots, topics with a clear country focus are left out. Acronyms: NAMA, Nationally Appropriate Mitigation Action; IPCC, Intergovernmental Panel on Climate Change
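The overrepresentation score in panel a can be read as a ratio of topic shares. The sketch below follows that reading under simplifying assumptions: it counts paragraphs rather than whole documents, takes one topic per paragraph, and uses hypothetical names and inputs.

```python
from collections import Counter


def overrepresentation(paragraphs, language):
    """Ratio of a topic's share inside one language to its share elsewhere.

    paragraphs: list of (language, topic) pairs, one per paragraph.
    Returns {topic: ratio}; a ratio of 2 means the topic is twice as
    common in the given language as in all other paragraphs.
    """
    inside = Counter(topic for lang, topic in paragraphs if lang == language)
    outside = Counter(topic for lang, topic in paragraphs if lang != language)
    n_in, n_out = sum(inside.values()), sum(outside.values())

    ratios = {}
    for topic, count in inside.items():
        if outside[topic] == 0:
            continue  # ratio is undefined if the topic never occurs elsewhere
        ratios[topic] = (count / n_in) / (outside[topic] / n_out)
    return ratios
```

Plotting such ratios on a logarithmic axis places over- and under-representation symmetrically around 1, so a topic that is twice as common sits as far from the centre as one that is half as common.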

Politically, it is interesting to ask if different actors also represent different positions. Based on the past negotiations (IISD 2023a, 2023b), there are at least two broad categories of interest. The first is mitigation versus adaptation, where the Global North broadly prioritises the former and the South the latter. The second is forward- versus backward-looking, where there are three broad camps: (1) historically high emitters want the GST to focus on future plans and best practices; (2) more recently industrialised countries prefer to eschew the topic of future mitigation, pushing instead for a retroactive stocktake of why actions to date have been insufficient and (3) many of the most-vulnerable countries think both historical responsibility and future actions should be discussed comprehensively.

Looking at the regional differences in country submissions (Fig. 1b), it becomes clear that Europe & Central Asia as well as North America focus more on plans and scenarios. For the EU, there are additional planning-related topics on mitigation specifically (National- and EU registries, EU emissions). The topics of documents in the Russian language are perhaps the most similar, but they discuss economics generally, rather than solutions per se. Overall, planning, financial and mitigation centred topics are over-emphasised in submissions from the Global North. This is in line with their push for a forward-looking GST.

By contrast, French and Spanish language submissions place a larger emphasis on impacts and vulnerabilities, which is particularly true for submissions from Africa and the Middle East. The two most overrepresented Spanish topics are notable too. Pastoralism is almost never mentioned in other submissions and is almost entirely driven by the submissions of Argentina, which repeats statements around emissions from pastures and cattle rearing in nearly identical phrasing throughout its reports. The topic around Nationally Appropriate Mitigation Actions (NAMAs) and Coffee is shared more widely in South American submissions. Both topics highlight how relatively specific issues can be of high domestic and regional interest.

Asian submissions, although almost always made in English, appear to stress different topics. Disaster risk, for example, is overrepresented, which reflects the vulnerability of this region to extreme weather events, including hurricanes and typhoons. Perhaps relatedly, Capacity building is in the top 5 most overrepresented topics only in East Asia and the Pacific.

Finally, in an absolute sense (Fig. 2), it is notable that submissions from countries are generally more concerned with descriptive topics than with solutions, i.e. there is an order of magnitude more paragraphs discussing impacts and vulnerability topics than there are paragraphs on adaptation action; topics on mitigation action are less frequently discussed than topics describing emissions, especially for Latin America and Africa. Outside of Europe and Central Asia, most solutions-oriented topics deal with finance, capacity building or technology. As many of these countries are net-recipients of climate funds and support, it is in their political interest to highlight these.

Fig. 2

Heatmap of the occurrence of paragraphs by topic and World Bank region. The percentage value is normalised by column, i.e. it shows what share of paragraphs from the region cover a given topic, which is a proxy for regional priorities. The colour scale is logarithmic. The topic groups correspond to the major themes of the Synthesis Report, which we break down slightly further to highlight topics with a focus on climate action (as opposed to emissions and impacts), as well as separating out non-financial means of implementation, such as capacity building. The final row and column contain the total number of paragraphs per region and topic as an absolute number. For a more disaggregated view, see Figure SM1
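The column normalisation described in this caption amounts to dividing each cell by its region total. A minimal sketch, assuming one (region, topic group) label pair per paragraph and using illustrative names:

```python
from collections import Counter, defaultdict


def column_shares(paragraphs):
    """Share of each region's paragraphs that falls into each topic group.

    paragraphs: list of (region, topic_group) pairs, one per paragraph.
    Returns {region: {topic_group: percentage}}; each region sums to 100.
    """
    counts = defaultdict(Counter)
    for region, group in paragraphs:
        counts[region][group] += 1

    shares = {}
    for region, group_counts in counts.items():
        total = sum(group_counts.values())
        shares[region] = {
            group: 100 * n / total for group, n in group_counts.items()
        }
    return shares
```

Normalising per region removes the effect of some regions submitting far more text than others, so the shares can be compared across columns.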

Overall, North–South differences are fairly straightforward, with mitigation being much more strongly represented in Northern submissions and impacts being more represented in the South. This suggests again that documents are used to highlight political priorities. Note finally that submissions from non-Party stakeholders are more closely aligned with submissions from Southern countries, discussing issues like Food Security and impacts on the Sea & ocean relatively often.

3.2 Technical summary favours global summaries over local topics

A main outcome of the technical component of the GST is the Synthesis Report (UNFCCC 2023). Since this report provides the basis for further political negotiations, it is important that it covers all main topics of the GST submissions. In particular, we investigate if the key messages of the report can be traced back to the guided topics in our model.

We find that 13 of the 17 guided topics that represent the key findings from the GST Synthesis Report can be clearly identified. Because of the stochastic nature of the method, it is difficult to be certain why the other 4 topics are no longer found in the final model, but there are at least two possible reasons: either there were not enough submissions discussing the topic in depth with similar phrasing to the report, or the key findings were not sufficiently distinguishable from other topics. The latter likely applies to key findings 9, 11 and 13, which all discuss adaptation. Similarly, key finding 7 is on the need for a “just transition”, but the resulting topic focusses entirely on youth, which was only one of the 5 input keywords.

As briefly noted above, there are a few topics which are highly region-specific, but these generally do not appear in the Synthesis Report. Health impacts, for example, are mentioned a few times in the report as part of a list of items which countries discuss, but the topic model suggests that the majority of health-related texts are explicitly related to infectious diseases, which the synthesis report does not mention directly. There is a likely political reason for this: the Paris Agreement explicitly states that the GST reflects on collective progress, not the progress of individual countries or regions.

Correspondingly, when we look for paragraphs in the submission texts which closely match the paragraphs in the Synthesis Report (Fig. 3), we find that the language most resembles submissions from the UNFCCC secretariat (relative to their small volume of inputs) and other non-Party Stakeholders, even though they only make up a combined 10.8% of submissions. The latter group includes the IPCC, which is in places quoted (near) verbatim—e.g. paragraph 30: “There is a rapidly closing window of opportunity to secure a liveable and sustainable future for all” is a direct quote from the IPCC synthesis report (2023, par. C.1). Similarly, key finding 10 states “most observed adaptation efforts are fragmented, incremental, sector-specific and unequally distributed across regions” which is taken directly from the same (par. A.3.3).

Fig. 3

Origin of paragraphs in submissions which closely match paragraphs in the final synthesis report. In both plots, the colours represent the different main sections of the report. The 20 most-closely matching paragraphs are included with a minimum cosine similarity of 70%. In a, the percentage of most-closely matching paragraphs is given per region. In b, the same is given, but scaled to the number of total submissions—i.e. if the Synthesis Report would represent all inputs equally, all values would be 1; higher values mean a relative overrepresentation, lower values an under-representation
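The scaling in panel b can be expressed as the share of matches divided by the share of inputs. The sketch below uses made-up counts purely to show the arithmetic; the function name, the grouping and the numbers are hypothetical and are not results from this study.

```python
def representation_ratio(match_counts, submission_counts):
    """Share of closely matching paragraphs relative to share of inputs.

    match_counts: {group: number of closely matching paragraphs}
    submission_counts: {group: total number of submitted paragraphs}
    A value of 1 means proportional representation; values above 1
    indicate overrepresentation, values below 1 under-representation.
    """
    total_matches = sum(match_counts.values())
    total_subs = sum(submission_counts.values())
    if total_matches == 0 or total_subs == 0:
        return {}

    ratios = {}
    for group, n_subs in submission_counts.items():
        if n_subs == 0:
            continue  # no inputs from this group, so no ratio can be formed
        match_share = match_counts.get(group, 0) / total_matches
        sub_share = n_subs / total_subs
        ratios[group] = match_share / sub_share
    return ratios


# Illustrative numbers only: a group supplying 10% of the inputs but
# 50% of the matches is overrepresented by a factor of 5.
example = representation_ratio(
    {"Party": 50, "non-Party": 50},
    {"Party": 90, "non-Party": 10},
)
```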

This observation is most evident for the “Context” section of the GST report and least pronounced for paragraphs from the “Adaptation” section, which resembles country submissions from Sub-Saharan Africa and Europe & Central Asia. Relative to its number of submissions, South Asia is also well represented here. The “Mitigation” section of the report instead leans heavily on non-Party submissions, but also closely resembles European submissions. Considering that North America only contains three countries and therefore has relatively few submissions, its perspective on this issue is also relatively overrepresented.

By comparison, the skewness in language of submissions (Figure SM4) is less pronounced but still notable: non-English languages are uniformly underrepresented, even when accounting for their relatively smaller size. In part, this may be because the overrepresented non-Party submissions are generally made in English. Conversely, this means that non-English submissions are often made by countries; as these include regional priorities, they are less reflected in the report.

4 Conclusion

The GST is a complex process with multiple sources of evidence being integrated into a synthesis report as input for political negotiations. Here, we focussed on the large body of text submissions, which were the basis of the technical component of the GST. In broad terms, the major themes identified in our study match the main themes of the GST synthesis report, which is encouraging.

The GST was established under the Paris Agreement, which signalled a broader political transition towards climate solutions (Sun et al. 2022; Weikmans et al. 2021). However, in line with the GST report and other analyses (UNEP 2022a, 2022b; Wright et al. 2023), our findings suggest that there may be a “solution gap”: there are noticeably fewer solutions-oriented topics than descriptive topics on either emissions or impacts and vulnerability. A manual analysis is better suited to unpack whether the quality of these paragraphs can compensate for the quantitative under-representation. This highlights a clear limitation of the method adopted here: the frequency of words is taken as a proxy for importance and all pieces of text are weighted equally, which is not always appropriate.

The results also reveal that the content of submissions to the GST varies widely. This makes the underlying data a potentially rich source of information for policy makers. In theory, this information should be contained in the technical component of the GST, but we find evidence of unequal representation in the Synthesis Report, especially for countries from the Global South and non-English submissions. There may be several reasons for this, including the resources needed to translate evidence, region-specific issues and structural inequalities in the design, funding and implementation of climate action translating to long-standing inequalities in evidence (Callaghan et al. 2020). Additionally, although we see unequal representation, it is unclear if this is due to biases within the UNFCCC process, or because the report also considered workshops and oral inputs, which are not included here.

Generally, documents from countries match their political agendas: Global North submissions tend to be more forward-looking and mitigation focussed. Southern documents emphasise climate impacts and vulnerability. As the UNFCCC primarily provides a political forum, this is unsurprising (Weikmans et al. 2021; Wright et al. 2023), but it does pose a significant problem for the technical component of the GST: it should provide a general and factual synthesis, but it needs to base its findings on inputs which are specific and not politically neutral.

Our results suggest that two strategies were employed to increase the chances that the Synthesis Report receives the required recognition at COP28. First, the report leans on existing syntheses from both the UNFCCC Secretariat and non-Party stakeholders like the IPCC. These reports have often already been recognised by the UNFCCC, so if a country would object, the co-facilitators can point out that the same findings were already accepted by the country previously. Second, the report mostly leaves out topics with a clear regional focus, thereby avoiding accusations of favouritism. This fear is far from imaginary: some countries for example opposed a “technical annex” to the report with concrete examples of best practices on the basis that it could never be complete, so the selection of the cases would necessarily favour some Parties over others (IISD 2023b). Whilst we understand why these routes were taken, it does raise important questions: what is the added value of the GST in this form? How does this help us to answer the question of “are we on track with climate action” better than prior efforts like those of the IPCC for example? And is the information specific enough to pinpoint where and how we have to “ratchet up” climate action?

To be clear, such doubts do not mean the GST is without value: the process provides legitimacy, which can help spur action (Milkoreit and Haapala 2019). But it is notable how the initially high ambitions of the GST, and perhaps of the Paris Agreement more broadly (Sachs 2019; Sun et al. 2022), were constrained over time by the countries themselves.

This brings us back to the added value of a data science approach to the Global Stocktake. As we show here, modern AI methods can help identify key issues across the full range of GST submissions, including non-English language texts and large volumes of unstructured text. Because it is comprehensive and relatively easy to replicate, such an analysis could function as a quantitative substantiation for including certain topics in technical summaries going forward. We have chosen here to highlight language- and regional differences, in part to underline our methodological contribution of multilingual topic modelling. More qualitative and critical assessments will remain essential too. Methods such as topic models cannot fully automate evidence synthesis. Still, they can be useful tools to inform critical inquiry in a transparent way. This has political advantages: if a given finding does become controversial, users can trace the source of that statement much more easily than is currently possible. Moreover, the replicability of computer-based methods also means that successive iterations of the GST could better build upon the current findings, especially if the methodological details are made publicly available.

Taking stock of global climate action remains a pressing and worthy goal. Some impediments will require political courage to overcome, but AI can at least assist with solving some of the practical issues, making the process more transparent, timely and efficient.