Topic modelling the mobility response to heat and drought

We conducted a systematic literature review of peer-reviewed full text articles on the nexus between human mobility and drought or heat published between 2001 and 2021, inclusive. We identified 387 relevant articles, all of which were analysed descriptively using a dictionary-based approach and by using an unsupervised machine learning–based Latent Dirichlet Allocation (LDA) model. Most articles were in response to droughts (71%), but heat and extreme temperature became more prominent after 2015. The drought-related literature focuses geographically on African and Southern Asian countries, while heat-related research has mainly been conducted in developed countries (mostly in the USA and Australia). For both hazards, European countries are under-represented. The LDA model identified 46 topics which were clustered into five major themes. One cluster (14% of all articles) included literature on heat-related mobility, mostly data-driven models, including amenity migration. The other four clusters included literature on drought, primarily on farming societies and the agricultural sector with three of those clusters making up 63% of all articles, with the common overarching focus on climate migration and food security. One of the four drought clusters focused on social dysfunction in relation to droughts. A sentiment analysis showed articles focusing on voluntary mobility as part of adaptation to drought and heat were more positive than articles focusing on migration triggered by droughts and heat. Based on the topics and the article characterisation, we identified various research gaps, including migration in relation to urban droughts, heat in farming societies and in urban societies of developing countries, planned retreat from hot to cooler places, and the inability or barriers to doing so. More research is also needed to understand the compound effect of drought and heat, and the social and psychological processes that lead to a mobility decision.


Introduction
Climate change is accelerating, with average global temperatures predicted to exceed pre-industrial levels by 2.7 °C by 2100, more than one degree above the limit set in Paris in 2015 (Olhoff and Christensen 2021). The consequences of climate change include more frequent and intense extreme weather events, often occurring together or intensifying each other (Zscheischler et al. 2018). Amongst the hazards that are frequently concurrent are droughts and heatwaves, which occur over various timescales (Alizadeh et al. 2020; Ribeiro et al. 2020), and are increasingly being considered drivers of human mobility (Mastrorillo et al. 2016; Mueller et al. 2020). The literature in this field has been expanding rapidly (Zander et al. 2022) to the extent that it can now be reviewed to understand the scope of the research already conducted and the gaps where further research would appear to be warranted.
We therefore systematically review the broad literature on the effects of drought and heat on human mobility to provide insights into the context in which articles treat mobility in relation to heat and drought, the scope of the articles, and their polarity (sentiments) concerning the issues emerging around this mobility. We also provide a summary of the spatial and temporal distribution of studies and derive research gaps. Understanding the emphases and gaps of past research in this area is of increasing importance because, while fast-onset hazards often trigger temporary movements with subsequent return migration (Black et al. 2011), slow-onset hazards can irrevocably change human habitats and trigger permanent migration, undermining livelihoods and food security (McLeman 2019; Ulus and Ellenblum 2021; Zickgraf 2021). The cumulative impacts of persistent heat and drought further reduce the opportunities for adaptation through other means, making migration inevitable (McLeman 2019).
Heat can be both a sudden-onset hazard, with intense heat waves usually lasting only a few days, and a slow-onset hazard, with many regions and countries recording continuously increasing average temperatures. Extreme heat over long periods is a public health concern, causing heat stress and related illness (Capon et al. 2019; Ebi et al. 2021) and a decline in well-being and productivity (Zander et al. 2015; Andrews et al. 2018). Excessive heat is regarded as "one of the most underappreciated hazards of climate change" (Nature Editorial 2021) and the temperatures in some regions in the world already regularly exceed health thresholds (Horton et al. 2021). Eventually some places are likely to become uninhabitable since adaptation to heat is constrained by physiological limits to cooling (Sherwood and Huber 2010; Asseng et al. 2021). Migration to cooler places thus becomes a likely strategy to reduce negative health implications and a decline in well-being (Cattaneo and Peri 2016; Zander and Garnett 2020).
Droughts are characterised by prolonged dry periods when precipitation is well below average and are increasing in intensity and frequency as the climate changes (IPCC 2021). They are usually considered to be a slow-onset hazard, although this depends on the discipline (Singh et al. 2021). Droughts have their greatest effect on farming (Shiferaw et al. 2014) and have the potential to compromise global food security (Elliott et al. 2018). While heat affects farmers' personal health and well-being and that of their workers (Budhathoki and Zander 2019), drought reduces and damages agricultural production and livestock productivity (Ribeiro et al. 2020). There is a large body of literature on how farmers adapt to droughts by modifying farm processes and inputs (e.g. Gautier et al. 2016). However, recurring droughts can also mean that farmers are sometimes unable to adapt their farming and need to migrate to other areas where off-farm work might be available (Hermans and McLeman 2021). These moves can vary in duration; although often considered dichotomously as temporary or permanent, drought-related movement can also be seasonal or circular (Findley 1994; Rademacher-Schulz et al. 2014).
Migration has been categorised as the climate change adaptation strategy of last resort if in-situ adaptation fails (Black et al. 2011) and occurs on a continuum between planned and forced movement. In the last decade, research on climate adaptation has flourished (Dayeen et al. 2020; Callaghan et al. 2021), including on climate migration. Recent studies have reviewed the impact of climate change on migration, either through non-systematic methods (e.g. Cattaneo et al. 2019; Hauer et al. 2020), meta-analyses (e.g. Hoffmann et al. 2020; Beine and Jeusette 2021; Šedová et al. 2021) or systematic analyses (e.g. Piguet et al. 2018; Hoffmann et al. 2021; Thalheimer et al. 2021; Zickgraf 2021; Zander et al. 2022).
Many of the existing systematic literature reviews rely on hand-coding to summarise and categorise the resulting articles. This has two major limitations: the limited volume of articles, in terms of both number and length, that can reasonably be analysed manually, and the challenge of subjectivity and existing prior beliefs of researchers that might influence the hand-coding process (Lesnikowski et al. 2019). Reviews of research can also be indicative of the salience of the issues to the societies and the political environment in which the research is being conducted (Sietsma et al. 2021). Such biases can be avoided by topic modelling, an unsupervised machine learning text mining method which has, with the advancements in computer power, been applied to analyse large numbers of documents and/or documents of great length (e.g. Callaghan et al. 2021; Sietsma et al. 2021; Berrang-Ford et al. 2021). In the context of climate change-related hazards and mobility, only a few studies have applied unsupervised text mining methods (Thalheimer et al. 2021; Zander et al. 2022).
We applied topic modelling to articles that were obtained through a systematic review of peer-reviewed articles following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) approach (Liberati et al. 2009). Our study sets itself apart from existing systematic reviews in two ways. First, we focus on two specific slow-onset hazards. Most existing reviews do not focus on individual hazards but address climate change per se. While Hauer et al. (2020) review the literature on sea level rise and migration, and Zickgraf (2021) on sea level rise and drought, no review has yet considered the two related slow-onset hazards, drought and heat, and their link to mobility.
Second, we analyse the full text of each article, not only the metadata. Most unsupervised text mining studies, which sometimes analyse thousands or hundreds of thousands of articles, only analyse abstracts, titles, and keywords (e.g. Dayeen et al. 2020; Berrang-Ford et al. 2021; Callaghan et al. 2021; Sietsma et al. 2021). By analysing full texts, we have been able to explore the context of articles in much greater detail. We have also been able to conduct a linguistic analysis to identify polarity, i.e. whether the language frames mobility as an opportunity (positive words used) or as something unfortunate (negative words used). We are thus able to contribute to the debate about migration as an opportunity and planned behaviour or as involuntary forced action (e.g. Tacoli 2009; Black et al. 2011). While some recent studies have investigated migration as a strategy to minimise risk with long-term welfare effects (e.g. Mueller et al. 2020; Nienkerke et al. 2023), empirical research about this polarity is still scarce (Piguet 2022).
Quantitative evidence on the link between climatic factors and mobility is rare and the existing reviews suffer from disciplinary hurdles (Thalheimer et al. 2021). For example, focusing only on climate migration can overlook the demographic literature that considers the many other facets of human mobility, such as residential and amenity mobility (Zander et al. 2022). Our study aims to fill this gap by applying a broad range of keywords and an innovative method, topic modelling, which automatically analyses lengthy texts.

Materials and methods
We applied two approaches to systematic analysis of the literature on heat, drought, and human mobility over the period 2001 to 2021 inclusive. First, we characterised the literature derived from a search process that applied an automated dictionary-based approach as well as manual coding. Second, we used unsupervised topic modelling to reveal issues and themes without bias. Using both approaches helped us to verify and interpret the machine-based results. We then combined the results of both approaches and applied basic descriptive statistics to reveal whether the characteristics differ across the revealed topics.

Systematic literature review
We searched two databases, Scopus and Web of Science (WoS), following the PRISMA flowchart and reporting standards for the inclusion and exclusion of articles (Fig. 1). Before the final search, we tested a series of keyword combinations and limitations to avoid a large number of irrelevant results. Search terms were reviewed by all authors. We used broad search terms related to heat (heat, heatwave, extreme temperature) and drought as well as a mobility term (migrate, evacuate, displace, relocate). We limited the search to articles in English and excluded certain subjects such as engineering, physics, or mathematics. We completed the search on 12 December 2021 and included all publications from 2001 to that date. The search (see Table S1 in Supplementary Materials for the search string) yielded 1942 articles in Scopus and 1290 articles in WoS with 1010 duplicates (Fig. 1). The review comprised four stages: (1) an initial identification of candidate articles from the two databases, (2) a first screening of titles and abstracts, (3) a second screening involving reading of the full texts, and (4) a cross-reference check. The first screening was mainly done by the lead author but verified by another author who independently screened three complete years (2018, 2019, 2020; about a quarter of all articles in the identification stage). The screening of a proportion of the literature independently by a second researcher is recommended (e.g. Shaw et al. 2014). The results of the two screenings differed by only three articles, which gave us confidence that the screening of all articles would not differ significantly when done by another person applying the same exclusion criterion.
The exclusion criterion for the first screening was:
• Articles where a mobility term was not used in the context of people
Exclusion criteria for the second phase were:
• Articles about evacuation in case of fires (e.g. in buildings) not related to climate change which were often about evacuation planning or dangerous heat from such fires (e.g. health issues for fire fighters when evacuating people from fires)
• Articles about mortality displacement, a measure of excess deaths from heat (these articles are not on mobility)
• Articles about the need for heating after evacuations or for migrants during the migration process (not on heat as an effect of climate change)
After the screening process, we did a cross-reference check with other review articles on climate change migration (Cattaneo et al. 2019; Beine and Jeusette 2021; Hoffmann et al. 2020, 2021; Thalheimer et al. 2021; Zander et al. 2022) to ensure we included all key literature, even if not found through our keyword search of the two databases. Through this process, we added 48 articles, leading to 387 articles in total.
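The record counts reported above can be cross-checked with simple arithmetic. The short Python sketch below is illustrative only: it assumes that each of the 1010 duplicates corresponds to one record removed from the combined database results, and the variable names are our own. Fig. 1 remains the authoritative source for the flow of records.

```python
# Counts taken from the text; the interpretation of "duplicates" as
# records removed once each is an assumption made for this sketch.
scopus_hits = 1942
wos_hits = 1290
duplicates = 1010

# Unique candidate records entering the first screening.
unique_records = scopus_hits + wos_hits - duplicates

# Articles retained after both screenings, derived from the final total
# and the number added through the cross-reference check.
final_total = 387
added_by_cross_reference = 48
retained_after_screening = final_total - added_by_cross_reference

print(unique_records, retained_after_screening)
```

Under these assumptions, 2222 unique records entered the first screening and 339 articles remained before the cross-reference check added a further 48.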

Topic modelling and clustering
Topic modelling, a form of natural language processing, is a tool for analysing large texts, in this case a few hundred full scientific articles, which cannot easily be read and analysed objectively without a machine. Researchers often unconsciously overstate topics that interest them and that reflect their expertise while overlooking other important insights (Nunez-Mir et al. 2016). The topic model approach is an unsupervised machine learning technique which classifies articles according to their key topics and word frequencies. Articles within each topic have similar words that occur frequently together (Blei et al. 2003). The automated process can reveal hidden or latent clusters and topics not immediately obvious to the researcher (Chang et al. 2009). However, the researchers' knowledge and experience are also needed to interpret the model outcome into meaningful themes (Karami et al. 2018).
We fitted an LDA model to the 387 full, cleaned texts using the package topicmodels (Grün and Hornik 2011). There are different approaches to estimating LDA parameters, such as Gibbs sampling (Griffiths and Steyvers 2004) or, as applied here, the variational method (Blei et al. 2003). In this approach, the data are structured into three levels (words, topics, and documents) and each document and word have a probability of belonging to each topic (Blei et al. 2003). The topics derived from the LDA model were further grouped into clusters of topics containing similar words using the K-means clustering algorithm, with the Euclidean distance as the measure of similarity. The detailed steps are provided in the Supplementary Methods in the Supplementary Materials.
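To illustrate the clustering step, the following Python sketch applies K-means with Euclidean distance to a handful of topic-word probability vectors. It is a minimal sketch and not the code used in this study: the vocabulary, the probabilities, and the fixed starting centroids are all hypothetical, and the analysis itself was run with the R package topicmodels as described in the Supplementary Methods.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, centroids, max_iter=100):
    """Basic K-means (Lloyd's algorithm); returns (labels, centroids)."""
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: euclidean(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:  # converged: assignments unchanged
            break
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical topic-word probabilities (one row per topic) over a toy
# vocabulary: ["temperature", "model", "rainfall", "farmers"].
TOPICS = [
    [0.50, 0.40, 0.05, 0.05],
    [0.45, 0.45, 0.05, 0.05],
    [0.05, 0.05, 0.50, 0.40],
    [0.10, 0.05, 0.40, 0.45],
]

# Fixed starting centroids keep the toy example deterministic; in
# practice K-means is usually started from random initialisations.
labels, centroids = kmeans(TOPICS, centroids=[TOPICS[0], TOPICS[2]])
print(labels)
```

In this toy example, the two topics dominated by 'temperature' and 'model' fall into one cluster and the two dominated by 'rainfall' and 'farmers' into the other, which mirrors how topics sharing similar words are grouped into themes.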

Article characterisation
To characterise the articles and emerging topics and clusters, we used an automated dictionary approach. We were interested in four characteristics: the dominant hazard, agency, scope, and sentiments, as defined below:
• Hazard: focus on drought versus focus on heat, which includes extreme hot temperatures and heat waves
• Agency: impact-oriented and vulnerabilities implicating the forced, undesired spectrum of the agency continuum of migration versus adaptation-oriented and resilience, suggesting planned and voluntary migration
• Scope: farming societies versus the whole society (no cohort of people specified but general terms such as populations, people, or persons used in texts; also not explicitly excluding farming societies)
• Sentiment: negative versus positive sentiment, defined by the use of words usually associated with negative or positive sentiments. We used polarity as the measure, which might indicate whether migration is presented as an opportunity when coping with drought and heat or as an undesirable action (see Sect. 1). This measure could also confirm the 'agency' characterisation.
For the hazard, agency and scope we generated our own dictionaries and calculated TF-IDF values for each of these characterisations for each article. Based on the articles within each topic and cluster, we were able to define the dominant characteristics of the topic and clusters as well. Sentiments were automatically assigned based on a predefined dictionary with around 9000 words commonly associated with positive or negative sentiments (Supplementary Methods).
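The dictionary-based characterisation can be sketched as follows. This Python example is illustrative only: the two small hazard dictionaries, the three toy 'articles', and the specific TF-IDF weighting (relative term frequency multiplied by the natural logarithm of the inverse document frequency) are assumptions made for the sketch, not the dictionaries or the exact weighting used in this study (see Supplementary Methods).

```python
import math
from collections import Counter

# Hypothetical dictionaries; the study's own dictionaries are larger.
HAZARD_DICT = {
    "drought": {"drought", "rainfall", "dry"},
    "heat": {"heat", "heatwave", "temperature"},
}

# Toy stand-ins for cleaned full texts.
ARTICLES = {
    "a1": "drought drought rainfall farmers migration",
    "a2": "heat heatwave temperature city migration",
    "a3": "drought dry heat farmers migration migration",
}


def tfidf_scores(articles, dictionaries):
    """Sum TF-IDF values of dictionary terms per article and category."""
    tokens = {name: text.lower().split() for name, text in articles.items()}
    n_docs = len(tokens)
    # Document frequency: number of articles containing each term.
    doc_freq = Counter()
    for words in tokens.values():
        doc_freq.update(set(words))
    scores = {}
    for name, words in tokens.items():
        counts = Counter(words)
        scores[name] = {
            label: sum(
                (counts[t] / len(words)) * math.log(n_docs / doc_freq[t])
                for t in terms
                if counts[t] > 0
            )
            for label, terms in dictionaries.items()
        }
    return scores


scores = tfidf_scores(ARTICLES, HAZARD_DICT)
# The dominant hazard is the category with the highest summed score.
dominant = {name: max(s, key=s.get) for name, s in scores.items()}
print(dominant)
```

Terms that occur in every article (here 'migration') receive a weight of zero, so the score is driven by terms that distinguish articles from one another. The same logic would apply to the agency and scope dictionaries.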

Temporal
Twenty years ago (2001), there were six articles on migration and drought, none on heat. The total number of articles changed little until 2009, remaining at fewer than ten per year (Fig. 2). The number of articles rose to between 12 and 16 from 2010 to 2013 and in 2014 nearly doubled to 28. The number of articles in total was highest in 2016 (45) and 2017 (41) with a slight decline since.
Until 2006, no article mentioned heat or included heat-related terms. Between 2006 and 2015, heat-related articles were in the minority with fewer than a third of articles within each year focusing on heat. However, the share of articles on heat-related migration increased to an average of 44% after 2015.

Spatial
Most articles (71%; 276) have investigated a specific country, a small number considered multiple specific countries (30; 8%), while the remaining 21% of articles (81) were from a global or regional perspective. This last group mainly included reviews or policy papers with several using global or regional datasets without specifically stating a country. Across those country-specific articles, most focussed on Africa (39% of all articles; Fig. 3) with Ethiopia the most researched country (31 articles), followed by Burkina Faso (14) and Tanzania and Ghana (13 articles each). Asia has also been an important focus (28%), with India the predominant country (25 articles), followed by Bangladesh (18) and China (10). Studies on the American continent (12%) have been driven by studies from Mexico (17), the USA (13), and Brazil (7). Oceania, mainly through Australia (4%), is less well represented, and only five articles dealt with Europe, of which only two specified a country (Spain and Serbia).
The global distribution of studies including a drought term was very similar to the distribution of all studies, while studies on heat have focused on the USA (11 articles), India (8), Bangladesh (7), Mexico (7), and Australia (6) (Fig. S1 in the Supplementary Materials). The USA, Malaysia, and Spain are the only countries which contributed more heat- than drought-related migration studies to this review.

Characteristics
Drought was the dominant hazard in 71% of all articles, heat in 29%. Nine topics (8, 11, 15, 18, 26, 30, 35, 41, and 43) only contain articles on drought while two topics (14 and 21) only contain articles on heat (Fig. S2 in the Supplementary Materials). Most articles (62%) use more negative than positive words. The majority of articles (73%) are specifically concerned with farming societies and the agricultural sector while the remaining 27% do not distinguish farmers from the rest of society. Approximately the same number of articles are on impacts (51%) as on adaptation (49%). Figure 4 summarises the dominant characteristics of the whole body of literature. Where the studies were conducted had a major influence on many of the characteristics. Articles from Africa (85%), Asia (75%), and South and Central America (68%) are predominantly about farming societies, while no study in Europe and only 32% in North America and 41% in Oceania are concerned with farming societies. Studies discussing global or regional issues relating to heat or drought and migration also rarely focus on farming (23%; chi-squared = 104.08; df = 6; p < 0.001). Likewise, the dominant hazard considered in Africa (84%), Asia (72%), and South and Central America (71%) is drought but the share is significantly lower in studies from Europe (50%), North America (32%), Oceania (47%), and for global studies (63%; chi-squared = 35.00; df = 6; p < 0.001).
Significantly fewer articles focussing on the whole of society also focus on drought (58%) than do the articles on farming societies (79%; chi-squared = 17.60; df = 1; p < 0.001). There is no significant difference in whether articles concern impact or adaptation across the two hazards or across the scope.
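The comparisons above rest on chi-squared tests of independence on contingency tables. The Python sketch below shows how such a statistic and its degrees of freedom are obtained. It is illustrative only: the counts in the table are hypothetical and are not the counts underlying our results, and the function computes Pearson's uncorrected statistic, whereas statistical software may apply a continuity correction to 2 x 2 tables by default.

```python
import math


def chi_squared(table):
    """Pearson chi-squared test of independence for an r x c table.

    Returns (statistic, degrees of freedom, p-value). The p-value is
    only computed for df = 1, where it has a simple closed form.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    p_value = math.erfc(math.sqrt(stat / 2)) if df == 1 else None
    return stat, df, p_value


# Hypothetical counts: rows are two groups of articles (e.g. scope),
# columns are two categories (e.g. dominant hazard).
TABLE = [[30, 10], [20, 40]]
stat, df, p_value = chi_squared(TABLE)
print(round(stat, 2), df, p_value < 0.001)
```

A table comparing seven regions across two categories has (7 - 1) x (2 - 1) = 6 degrees of freedom, and a 2 x 2 comparison has one, which matches the df values reported in this section.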

Topic description and clustering
We chose an LDA model with 46 topics, based on different metrics used to determine the optimal number of topics (see Supplementary Methods). The most common words within each topic are listed in Table S2 in the Supplementary Materials and depicted as word clouds in Fig. 5. K-means clustering of the 46 topics resulted in five clusters (Fig. 5). Drought is the dominant hazard in four of the five clusters while heat is the focus of only one cluster (cluster 1) (Table 1). All six topics belonging to this cluster focus on heat (Table S2 in Supplementary Materials). Only another four topics in the other clusters focus on heat (topics 19, 21, 37, 39). Cluster 1 is also the only cluster dominantly on the whole of society and not specifically on farming. Cluster 4 equally includes topics on farming and the whole of society.
Clusters 3, 4, and 5 also resemble each other, stemming from the same branch (Fig. 5). These three clusters focus on drought and farming societies, although cluster 4 also contains many articles and topics on drought migration amongst the whole of society. They differ in the source of data, the extent of agency in decisions, the sentiment, and the emphasis on different elements in society.

Cluster 1: Temperature and heat exposure (14%)
Cluster 1 is labelled 'Temperature and heat exposure' because its sole focus is on heat. Two of the six topics within this cluster are on adaptation, discussing planned or intended and voluntary migration, and many of the studies focus on developed or emerging countries (e.g. Australia, USA, Brazil, Spain, China). Topics 14 and 40, accounting for 22% of articles within this cluster, are related to heat exposure and urban migration. The other four topics in this cluster are on modelling migration flows due to extreme temperatures and drought, with many studies modelling the impacts of multiple other natural hazards such as floods and storms. Articles within this cluster are heavily data- and model-dependent with the words 'estimate', 'estimation', 'data', 'effect', or 'model' showing up as keywords in each of the six topics.

Cluster 2: Migration flows (23%)
This cluster is the second largest and contains 13 topics, mostly describing forced migration of vulnerable communities living in developing countries. Articles within this cluster focus heavily on migration flows induced by droughts, although one topic (topic 21), which encompasses two articles, deals mostly with heat. Like cluster 1, this cluster includes literature on data-driven migration models, here of rural households and populations which depend on farming for their living and which are affected by drought.

Cluster 3: Livelihoods (21%)
The keywords in this cluster include 'adaptation', 'strategies', and 'livelihoods' amongst many words describing farming practices and drought effects ('rainfall', 'water') (Table S2 and word cloud in Fig. 5). This suggests that the overarching issue is how farmers and their households adapt to climate change-related droughts to minimise livelihood risks and secure food supply. This cluster differs from cluster 2 in its emphasis on adaptation rather than impacts, although impacts are also discussed. Words such as 'respondents' and 'survey' indicate that literature in this cluster is based on primary data collected from farmers, households, and communities. Topic 37 stands out as the only topic across the corpus dealing with island societies, specifically abandonment of islands due to drought. The word 'migration' is less important than in the other four clusters. Instead, farm abandonment is featured in two topics (25 and 37), indicating that migration decisions made by farmers in response to drought and heat can be permanent.

Cluster 4: Social dysfunction (17%)
Articles in this cluster, the second smallest with seven topics and 68 articles, are heavily data focused and discuss the impacts of multiple hazards, including drought. The target group of research within this cluster has no clear dominance and could be both farmers and the whole of society. Articles deal with social, health, and economic impacts of droughts as well as drought-conflict issues with an emphasis on populations migrating because of droughts, environmental impacts, and natural disasters. Cluster 4 emphasises the impacts of climate change and how these lead to social distress which might trigger migration, mostly in the African context.
The terms 'displacement' and 'displaced' are keywords in this cluster, and to a lesser extent 'relocation' (topic 23, the largest topic in this cluster), mostly with reference to whole villages or communities. Topic 23 also includes literature on the political and legal implications of displacement by climate change or environmental degradation. Topic 31 (like topic 29 in cluster 2) deals predominantly with the drought-conflict-migration nexus (but much more broadly than the drought-related Syrian conflict discussed in cluster 2).

Cluster 5: Family and community (25%)
This is the largest cluster, including a quarter of all articles in 11 topics. Unlike cluster 3, which it otherwise resembles, this is the only cluster in which most topics contain articles that use more positive than negative words. The focus of this cluster is planned household decisions and migration as an adaptation and coping strategy, although some topics emphasise the impacts of droughts and climate change. Articles investigate the migration process and the social impacts of migration on all involved ('migrants', 'farmers', 'families', 'communities', 'women'). Other social issues discussed in literature within this cluster include the role of social networks and remittances (topic 18); loss and sense of place (topic 2); and gender (topic 36). Twelve percent of articles within this cluster deal with livestock and pastoralists (topics 8 and 44) and how they adapt to droughts through migration. Like cluster 3, this cluster also includes terms such as 'respondents', suggesting that many articles employed social science methods.

Result of sentiment analysis
Across the whole corpus, the number of negative words ranged from 7 to 756 in an article, with a median of 135. The number of positive words ranged from 11 to 426, with a median of 107. Sixty-two percent of all articles contained more negative than positive words; 38% more positive words. There was no significant trend in the use of positive and negative words over the years (see Fig. S3 in Supplementary Materials). Studies on impacts include more negative than positive sentiments (75%) than do those on adaptation (53%; chi-squared = 21.75; df = 1; p < 0.001). Studies on drought include more negative sentiments (88%) than do studies on heat (41%; chi-squared = 4.06; df = 1; p = 0.044), and studies on farming societies more negative sentiments (87%) than studies on the whole of society (43%; chi-squared = 15.98; df = 1; p < 0.001).
In only one of the five clusters do positive sentiments tend to outnumber negative sentiments (cluster 5). Clusters 3 and 5 both include topics on adaptation. Cluster 3 revolves around migration being one of many adaptation strategies. Cluster 5 revolves around migration as the main adaptation strategy with many articles emphasising the benefits of migration in terms of financial security and risk reduction. The other difference between clusters 3 and 5 is that in cluster 5 words such as 'income', 'labour', 'work', and 'economic' are frequently used, suggesting that migration is being used as a strategy to increase income.
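The word counts behind this analysis can be illustrated with a short Python sketch. It assumes a simple dictionary lookup in which each word of a text is matched against lists of positive and negative words, and the article is labelled by whichever count is larger. The two word lists below are hypothetical and tiny; the analysis itself used a predefined dictionary of around 9000 words (Supplementary Methods).

```python
import re

# Hypothetical mini-lexicon standing in for the full sentiment dictionary.
POSITIVE = {"opportunity", "benefit", "secure", "improve"}
NEGATIVE = {"loss", "forced", "conflict", "vulnerable"}


def polarity(text):
    """Count positive and negative words and label the dominant polarity."""
    words = re.findall(r"[a-z]+", text.lower())
    n_pos = sum(w in POSITIVE for w in words)
    n_neg = sum(w in NEGATIVE for w in words)
    if n_pos > n_neg:
        label = "positive"
    elif n_neg > n_pos:
        label = "negative"
    else:
        label = "neutral"
    return n_pos, n_neg, label


adaptation_text = "Migration offers an opportunity to improve income and secure livelihoods."
impact_text = "Drought led to crop loss and forced vulnerable households to move."

print(polarity(adaptation_text))
print(polarity(impact_text))
```

A lookup of this kind captures the vocabulary of a text rather than the judgement of its authors or of the people moving, which is worth bearing in mind when interpreting the polarity of individual clusters.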

Discussion
Migration driven by drought and heat can potentially displace many people and alter population distributions, particularly when hazards are long-lasting (McLeman 2019; Ulus and Ellenblum 2021; Zickgraf 2021). While displaced people often return home after sudden-onset hazards like floods or wildfire (Black et al. 2011), people moving because of slow-onset hazards which are increasing in frequency and intensity may not be able, or wish, to return. This is certainly the case with sea level rise (Hauer et al. 2020) but is also true of heat and drought (Gray and Mueller 2012; Bohra-Mishra et al. 2014; Mueller et al. 2014; Mastrorillo et al. 2016; Zander and Garnett 2020). The demographic impacts of slow-onset hazards are the reason why understanding their nexus with migration, and more broadly with mobility, is of such importance. Our review of 387 peer-reviewed articles contributes to this understanding by describing the trends in published research. In this way, we can characterise the state of knowledge about the topics over time and space and identify research gaps that could be important for adaptation and policy development.

Temporal and spatial trends
For the first decade of the twenty-first century, there were fewer than ten articles published per year on the impacts of drought and heat on mobility, with most focussing on drought in Africa, south Asia, and Mexico. Since then, the number of relevant publications has more than tripled, mirroring wider trends in publications concerning environmental influences on human mobility (Zander et al. 2022). Coinciding with an increase in published articles has been an increasing emphasis on heat as a driver of mobility, particularly in the USA and Australia. This finding corresponds to those of Son et al. (2019) who found that most heat studies, including those on heat and health, have focused on developed countries, including China, but that drought has taken precedence as a problem for developing countries where farming is the main livelihood (Thalheimer et al. 2021). Consistent with this, drought-related studies, which still predominate amongst the body of work considered here, have mostly related to farming societies while heat studies have often been on urban communities or the whole of society.

Sentiments and agency
More negative than positive words are used across the whole body of literature, based on an applied sentiment dictionary, and only 16 topics (35%) are based on more positive than negative language. Most of these topics are grouped in cluster 5 ('Family and community'). Moreover, there was a significant positive relationship between the impact of drought or heat and the use of negative sentiments. This suggests that most authors interpreted migration as a necessary response to drought and heat, with negative consequences for those moving, or for society as a whole, rather than mobility representing a long-term adaptation strategy that can provide opportunities. This may be because many of those affected, such as farmers, do not have the means to plan for migration in a way that they would not consider disruptive or harmful to their wellbeing. Such people may have tried to stay at a location for as long as possible but have been forced to leave their farming lands as conditions became intolerable and farming unviable (Mueller et al. 2014; Cattaneo et al. 2019). Many farmers also choose non-migration as an adaptation strategy because of place attachment related to land and social networks (Mallick et al. 2022), so if relocation should nevertheless become necessary, it is described as a negative outcome.
The sentiment polarity can be seen when comparing articles in clusters 3 and 5. In cluster 3 ('Livelihoods'), one would have expected more topics with positive sentiments, since many articles consider adaptation. However, the high proportion of negative words may also reflect the sentiment of the researchers rather than that of the people moving. In cluster 3, migration was considered amongst multiple adaptation strategies. Migration is likely to have been portrayed as forced and undesirable. The reason why cluster 5 is the only cluster with mostly positive topics might be that many of the articles are on the welfare opportunities arising from migration, including labour and seasonal migration, which can have positive effects on whole families.
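As a minimal sketch of how a dictionary-based polarity score of this kind can be computed, the following Python example counts positive and negative lexicon words in a text and returns their normalised difference. The word lists, the function name `net_sentiment` and the two example sentences are assumptions made for illustration only; they are not the sentiment dictionary or the data used in this review.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon for illustration only; a real analysis would
# rely on a published sentiment dictionary.
POSITIVE = {"opportunity", "benefit", "improve", "resilient", "welfare"}
NEGATIVE = {"forced", "loss", "conflict", "crisis", "vulnerable"}


def net_sentiment(text: str) -> float:
    """Return (positive - negative) / (positive + negative) word counts.

    The score lies in [-1, 1]; 0.0 is returned when no lexicon word occurs.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    pos = sum(counts[word] for word in POSITIVE)
    neg = sum(counts[word] for word in NEGATIVE)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


# Two invented snippets standing in for article text.
adaptation = "Seasonal migration can improve welfare and offers an opportunity."
displacement = "Drought led to crop loss, forced displacement and conflict."

print(net_sentiment(adaptation))    # 1.0 (three positive words, no negative)
print(net_sentiment(displacement))  # -1.0 (three negative words, no positive)
```

A score of this kind reflects only the vocabulary chosen by the authors of an article, which is consistent with the caveat above that negative wording may express the researchers' framing rather than the experience of the people moving.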

Research gaps
An important function of a literature review is to identify research gaps. The unsupervised topic model applied here, together with the combinations of broad article characteristics (Table 1), shows which themes were covered between 2001 and 2021 and, conversely, which themes and combinations are not represented and therefore constitute research gaps. The following major research gaps are apparent.
Heat, farming, and migration. Fewer than a third of all articles (29%) are on heat. Only one cluster focuses on the effects of heat, and then as it affects the whole of society, not specifically farming societies. While a few articles model the effects of heat in combination with other climate change-related hazards (e.g. Bohra-Mishra et al. 2014; Mastrorillo et al. 2016), there is little research on the direct causal relationship between increasing heat and mobility amongst farming communities. More research is needed on the impacts of heat and extreme temperatures on the whole farming system, including farmers who suffer from heat stress and associated social and economic effects (Budhathoki and Zander 2019), and not only on agricultural production.
Drought and non-farming communities. While the effects of drought on farming are readily apparent, urban droughts are likely to become more frequent (Hoekstra et al. 2018; Zhang et al. 2019). In several locations around the world, cities have run out of water, relying on trucked supplies to eke out existence during dry seasons (Niasse and Varis 2020). To what extent periodic water shortages have influenced mobility emerges as a research gap based on the clustering of article topics.
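The logic of finding gaps from combinations of article characteristics can be sketched as a simple cross-tabulation: count the articles in each hazard and setting combination and list the combinations with few or no articles. In the Python example below, the category names, the article labels and the function `sparse_combinations` are hypothetical and chosen only to illustrate the idea; the characteristics in Table 1 and the actual counts in the review may differ.

```python
from collections import Counter
from itertools import product

# Hypothetical labels; the categories in the review's Table 1 may differ.
HAZARDS = ("drought", "heat")
SETTINGS = ("farming", "urban", "whole society")

# Invented (hazard, setting) labels for a handful of articles.
articles = [
    ("drought", "farming"),
    ("drought", "farming"),
    ("drought", "whole society"),
    ("heat", "urban"),
    ("heat", "whole society"),
]


def sparse_combinations(labels, max_count=0):
    """Return hazard/setting pairs covered by at most `max_count` articles."""
    counts = Counter(labels)
    return [
        pair
        for pair in product(HAZARDS, SETTINGS)
        if counts[pair] <= max_count
    ]


print(sparse_combinations(articles))
# [('drought', 'urban'), ('heat', 'farming')]
```

Raising `max_count` above zero would also flag combinations that are covered but only thinly, which may be a more useful threshold for a large corpus.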
Other research gaps are apparent from comparing topics and the most frequently used words with the broader literature on mobility and environmental drivers (Zander et al. 2022). Such gaps include:
Compounding effects. The word 'compound' never emerged as a key word in any of the topics and the compound impacts of heat and drought are only apparent in one topic (16). Given the common co-occurrence of heat and drought in nature (Alizadeh et al. 2020; Ribeiro et al. 2020), the scarcity of studies that consider their compounding impacts on human mobility represents a clear research gap. The synergistic effects of heat and drought may also influence mobility more than either hazard on its own given their separate influences on water relations and physical well-being. The timeframe of impacts of multiple climate hazards occurring simultaneously, and their effect on migration, is also poorly understood, with the rare exception of Ulus and Ellenblum (2021) (topic 16).
Studies from Europe (see Fig. 3). Europe is not immune from global trends in heat and precipitation, notwithstanding its technological capacity. For example, some vital waterways (e.g. the Rhine; Huang et al. 2015) are expected to experience more frequent drought, with far-reaching consequences for economies and societies that have long relied on rivers as transport corridors. Such climatic events are highly likely to influence human population flows, yet we identified only a single relevant study, from Spain (Fuller and Bulkeley 2013), and that study investigates how migrants adapt to heat in new locations rather than migration because of climatic factors. The lack of studies on the current or impending impacts of drought and heat on migration in Europe is particularly surprising given the amount of climate research undertaken on the continent.
Migration response to heat of both the urban and regional population of developing countries. While the urban heat island effect is widely appreciated as magnifying existing trends in climate change (Coffel and de Sherbinin 2018), how this will affect the megacities that now exist in many developing countries is an obvious lacuna in research. Most studies in developing countries are concerned only with the impacts of climate change on farming communities, reflecting the important role of farming in local economies and livelihoods. However, the influx of people into cities, and the vulnerability of such places to the effects of extreme heat, suggest that the influence of heat on mobility amongst city-dwellers should become an important subject of research. The urban heat island effect has the potential to reverse the trend of rural to urban migration. The exodus from some cities because of heat could involve many more people and might be forced rather than a preferred adaptation strategy.
Planned relocation or retreat. Such articles were characterised by migration and displacement as the most frequently used mobility terms in our review. In the broader climate change and disaster migration literature, such as that related to flooding (Mach et al. 2019) and sea level rise (Lawrence et al. 2020), there is reference to voluntary buyout programmes that will be needed once locations become unsuitable for habitation or are likely to be affected in the future. However, there appears to be no literature considering buyouts or other incentive and compensation schemes to persuade people to move before the areas in which they currently live become so dry and/or so hot that they exceed human physiological or adaptation capacity. Given predictions that many places could become physically uninhabitable because of heat (Mora et al. 2017), research on how to manage retreat from such locations would seem imperative if such adaptation is to be planned and not forced.
Heat-related amenity migration in developing countries. Related to planned movements are also those of wealthier people seeking warmer climates in which to retire. While amenity research commonly explores movements in relation to the climate of people in developed countries, there is a lack of studies on amenity-related migration occurring in developing countries, with wealthier people displacing poor communities from greener, cooler climates while the poor are trapped in hot urban slums (Hayes 2015).
Immobility research. Although our filters would have detected the words 'immobility' and 'trapped', neither appears as a frequently used key word in any cluster. Immobility was referred to only in relation to people's health as a reason why people feel they cannot move even though they would prefer to do so. Other drivers of immobility such as poverty and non-economic and psychosocial factors (Ayeb-Karlsson et al. 2020) may also contribute to the reasons people do not move from drought-prone and hot locations, even if it becomes unsafe and unhealthy to stay, and may warrant targeted policy responses when better understood.
Island responses. Only one topic (37) was on island populations, but heat and drought are likely to be particularly difficult for island communities, even where this is not confounded by sea level rise (Duvat et al. 2021). Unlike larger land masses, islands have little buffering against drought in terms of heat relief or water supplies, so are likely to be less habitable. Most research on the migration decisions of island societies investigates sea level rise, but drought is likely to create more immediate problems for island residents.

Study limitations
Overall, the topic modelling yielded topics that were distinct and made sense, and showed research gaps, but we acknowledge unsupervised text analysis should complement and not replace detailed evaluation of literature (Westgate et al. 2015). In particular, while topic modelling can cluster articles into broad topics and themes, and can also determine the focus of articles by looking at the frequently used words within these topics, topic modelling cannot detect nuances. For example, the model was unable to determine the direction of migration or its duration. Some words such as 'return' suggest return migration, and words such as 'displacement' a forced action, but the fact that the word migration did not appear frequently in these studies suggests that they were not on climate change migration itself but on the potential for it to occur. Topic modelling also has limited capacity to detect discourse about complex concepts which requires close manual analysis of introduction, discussion, and conclusions.
However, our aim has not been to highlight examples of key literature in this area. This could have been done by citation analysis or expert evaluation. Instead, we aimed to show how a major and diverse corpus of literature in a focal area of interest can be analysed and summarised. We could not have read and analysed the full texts of all 387 articles relevant to the topic and would inevitably have been unconsciously biased in our analysis even if we had tried to be impartial. Our approach is one that can readily be generalised given the rapid increase in published literature in many fields. The dictionary-based characterisation of articles can complement meta-analyses, which often rely on information limited to abstracts, titles, and keywords. The machine learning topic modelling approach to uncover themes makes the literature review process both more comprehensive and independent of the scientist.

Conclusions
This paper has investigated how the peer-reviewed literature on mobility in relation to drought and heat has evolved over the last two decades (2001 to 2021). Using topic modelling, a dictionary-based automated text categorisation and a sentiment analysis, we were able to identify and interpret five emerging themes (clusters) amongst the 387 full texts we analysed. Although heat and drought are related, the five themes suggest that researchers tend to consider each separately, with little consideration of synergistic impacts on mobility. Only one theme (14% of all articles) was predominantly about heat and extremely hot temperatures, and articles within the theme focus primarily on the whole of society, with many studies from developed countries. The other four clusters, on drought and farming societies, have strong links to livelihoods and food security. Research gaps emerged either from the clustering process itself (effects of heat on farming and migration, effects of drought on non-farming communities) or through an understanding of the broader mobility literature, particularly that driven by environmental change. This latter group of gaps points to opportunities for research on deliberate relocation to escape heat and drought; on migration either specifically related to heat or on heat and drought acting synergistically; on various forms of escape from hotter cities, both amenity migration to increase comfort and forced migration to escape excessive urban heat; on the various groups in society who would like to move but cannot despite heat or drought; and on various geographic areas that have so far been the subject of notably little research, including islands and Europe. The review thus illustrates the current breadth of knowledge about heat, drought, and mobility but also highlights the research needs and opportunities in a field that is certain to expand in coming decades as both heat and drought intensify.