1 Introduction

Besides the shock of the human lives lost to the disease, when the Covid-19 pandemic broke out, global public opinion was awed by the sight of the main non-pharmaceutical intervention that governments put in place almost everywhere – lockdowns. The images of an immobile world created an unexpected dystopian landscape. The prohibition on leaving home emptied cities, highways, stations and airports. As in a post-atomic fantasy, the media circulated pictures and videos of some of the planet’s usually most crowded venues – from Shibuya Station in Tokyo to Heathrow Airport in London – without a living soul in the middle of the day. After decades of reckless increase in the number of airline travellers, for instance, in April 2020 these were no more than 3 per cent of their number a year earlier (Recchi et al., 2022). No other nonviolent event could have looked more antithetical to – and thus more revealing of – the nature of social life in late modernity.

In the pre-Covid-19 era, the number of international trips had been on a steady and uninterrupted rise since at least 1960 – from 69 million to almost 3 billion border crossings per year (Recchi, 2015, 2016; Recchi et al., 2019). Never in history has it been so easy for human beings to move out of their usual residences – whether on daily commutes, weekend trips, holiday travel or (although not across the board) long-term migration spells (see Footnote 1). This is clearly more the case for the rich, whose lifestyle is often patterned after frequent journeys. Nonetheless, the drop in the cost of travel has also facilitated the (shorter-haul and less exotic) mobility of the less privileged, at least in high-income countries (e.g. Demoli & Subtil, 2019). Spatial mobility has thus progressively become a hallmark of our age, as social theorists Zygmunt Bauman and John Urry remarked as early as the turn of the millennium (Bauman, 1998; Urry, 2000). In the second half of the twentieth century, economic growth, progress in transportation, ICT developments and the absence of large-scale wars – that is, the major keys to globalization – paved the way for a more mobile world. While the aftermath of the Covid-19 pandemic and a possibly higher sensitivity to the climate impact of fuel-propelled mobility may stymie this pre-existing trend, human mobility will hardly cease to be part and parcel of what it means to live in the twenty-first century – not least because global inequality and climate change also spur migration (Barnett & McMichael, 2018; Milanović, 2019; Rigaud et al., 2018).

For a comprehensive take on the topic, we must acknowledge its different manifestations – first of all, in spatial terms. Human mobility is a multiscale phenomenon, spanning from the micro (local) to the meso (national) and macro (international) level. Repeated national surveys show that all of these levels saw an increase in the last decades of the twentieth century (for instance, in Germany: Zumkeller, 2009). People spend more time in mobility and cover a larger distance per unit of time than they used to. The second dimension that thus needs to be acknowledged is the temporal manifestation of mobility. Movements can be temporary, seasonal or longer term/permanent. The third dimension, which fundamentally shapes individuals’ mobility experiences as well as the legal and policy responses to them, is the reason for mobility: voluntary reasons (such as tourism, work, education or family) or forced displacement (as a result of conflict or natural disasters), with or without documentation. In practice, demarcations are not always clear, since reasons are often mixed and people are often motivated to move by a multiplicity of factors (Mixed Migration Centre, 2020), and intersections between migration and other forms of circular mobility are growing (Skeldon, 2018). Think, for instance, about ‘gap years’ and ‘sabbaticals’, not only in academia.

While mobility is a fundamental element of human freedom with real and perceived value for all groups affected by it, the historical intensification of mobility – and its likely spatial-temporal clustering in certain sites (for instance, global cities) or certain periods (for instance, during end-of-year festivities) – is a key concern from a policy perspective. We mention here just two issues that have come to the fore in recent years, partly in conjunction with the Covid-19 crisis – beyond the epidemiological risk that is openly associated with the movement of virus-carrying populations.

First is the ‘evasiveness of remote workers’ through mobility. Since the outbreak of the pandemic, firms have increasingly operated partly or completely remotely, asking or allowing workers to ‘work from home’. Such a shift, which was already under way in some industries like IT, may spur white-collar workers to relocate even over long distances – including abroad – as ‘digital nomads’. Some countries have introduced so-called digital nomad visas (Bloom, 2020; Hughes, 2018), which can be used to track such mobility. However, mobility within free movement zones, or by workers who relocate short term on tourist visas, often goes unrecorded, even though it may be of increasing interest for policies related to such mobilities, including taxation.

Second are the ‘environmental and epidemiological risks’ linked to human mobility. The bulk of human mobility is fed by fossil fuel burned for road, rail, air and maritime transportation. Almost all (95 per cent) of the world’s transportation energy comes from petroleum-based fuels, largely gasoline and diesel. Globally, transportation accounts for no more than 14 per cent of greenhouse gas emissions – less than electricity and heating (25 per cent), agriculture (24 per cent) and industry (21 per cent) (Pachauri et al., 2014). These proportions vary significantly by level of economic development, though, and transportation takes the lion’s share of greenhouse gas emissions in richer countries (e.g. the US: EPA, 2019). Therefore, the impact of mobility on the environment is bound to become more severe as economic development advances, unless major changes in fuel emissions take place. In parallel, travel spreads diseases, and increased travel may have made the world more vulnerable to epidemics, although more intense long-distance mobility does not necessarily entail a higher incidence of epidemics (Clemens & Ginn, 2020; Recchi et al., 2022).

The importance of effective policy responses to mobility has been amplified during the Covid-19 pandemic. A major puzzle is whether human mobility will change in size, scope and form in the coming years. At the macro level, the appetite of human beings for travel does not seem shaken, but travel limitations and travellers’ biometric and health controls are likely to be enhanced as ‘new normal’ ways of restricting (even surreptitiously) access for travellers deemed undesirable (Favell & Recchi, 2020). At the meso and micro levels, it is unlikely that the economic and cultural attractiveness of cities as poles of mobility will be disrupted, although the take-off of telework – as a complement to or substitute for office spaces – may incentivize some sort of ‘flight to the suburbs’ (Florida et al., 2021). Clearly, the pandemic moment and its aftermath expose the importance of monitoring human mobility with adequate measurement tools to improve response capacities. At the same time, the pandemic experience has also drawn the general public’s attention to the ethical aspects of tracking mobility.

2 Monitoring Human Mobility: Traditional and New Data

Tracking human movements in space has always been a challenge for population statistics, but new data open up new opportunities and challenges. Previous studies (European Commission, 2016; Bosco et al., 2022) offer a systematic review of the literature about measuring migration with traditional and new data sources, which we complement with up-to-date information and critical consideration from a policy-related angle. As also suggested by Taylor in this volume (Taylor, 2023), we pay particular attention to what digital data reflect and how they can be used for policy-relevant analysis, foregrounding the policy issue to be solved and the evidence needed in support, and discarding the ‘panopticon illusion’ (and danger) of making everything visible through mobility data.

2.1 Traditional Data: Pros and Cons

Traditionally, mobility has been measured with censuses, population registers, administrative sources and household surveys. Data from these sources are cleaned, edited, imputed, aggregated and used to produce official statistics, including the datasets documenting international migration flows and migrant stocks released by agencies of the United Nations.

The major advantages of traditional data sources are that they are transparent, frequently curated and stored in public databases (with varied degrees of accessibility), allowing comparability over time and across countries. However, there are some important limitations. First, they are not reliably available in many parts of the world. Estimates of in- and out-migration flows by country of origin and destination are reported to the United Nations by only 45 countries (UN DESA, 2015). Second, these aggregate statistics have poor temporal and spatial resolution. Moreover, with these standard approaches, the category of internally displaced persons is often overlooked despite its policy relevance. In general, inconsistent definitions of a migrant make it difficult to compare data across different countries (Sîrbu et al., 2021). Third, and as an extension of the previous point, traditional sources typically do not capture circular, short-term, seasonal or temporary mobility (Hannam et al., 2006). Fourth, surveys that include large enough samples of people with different migratory backgrounds and socioeconomic profiles, particularly the most vulnerable, in different contexts and over time are either unavailable or not systematically available, despite their importance for understanding inequalities in terms of education, housing, employment, discrimination, well-being, access to services and protection, etc. Finally, censuses and surveys have data publishing lags of several years (see Footnote 2). This is particularly problematic in a context in which migratory flows become increasingly complex and dynamic, and in emergency situations, including environmental or health crises.

Both academic and nonacademic actors have tried to improve the comparability and availability of traditional data sources and to reconcile measurement problems, such as undercount, varying duration-of-stay criteria and coverage (de Beer et al., 2010; European Commission, 2016; Raymer et al., 2013). To respond to these shortcomings, and with the purpose of informing the humanitarian community and government partners, different international organizations have established data collection and dissemination mechanisms on specific aspects of human mobility, such as UNHCR’s refugee statistics, ILO’s labour migration statistics, the World Bank remittances database or IOM’s data on various migration matters, including internal displacement and its ‘missing migrants’ project. These organizations are increasingly aiming to incorporate more digital, non-traditional data sources as part of their migration data strategies.

Table 23.1 Characteristics of some traditional and non-traditional data sources for the empirical study of human mobility

2.2 Non-traditional Data Usages: An Overview

The mass use of digital devices across the globe has generated large repositories of spatiotemporal ‘trace data’ (Chi et al., 2020), some of which provide new opportunities for ad hoc measurements and modelling of human mobility. While new technologies are capturing mobility rather than migration data (McAuliffe & Sawyer, 2021), some can also be used to better understand certain aspects of migration. As outlined in Table 23.1, non-traditional data sources differ significantly in terms of the information available, the populations covered, geographical availability, the data level (individual or grouped), representativeness and bias issues, and sensitivity – and, in consequence, in terms of whom they reflect, the mobility events they capture (micro, meso and macro level), the ethical issues they raise and their usefulness for providing policy-relevant information. In the policy sphere, categories are used to define ‘groups of people who are assumed to share particular qualities that make it reasonable to subject them to the same outcomes of policy’ (Bakewell, 2008: 436). In relation to issues such as health or the environment, information about mobility events (how many individuals move, where, when and how) provides key information, and the characteristics of who moves may be secondary. In the context of migration, by contrast, analytical or administrative categories, such as ‘migrant’, ‘foreign worker’, ‘internally displaced person’ or ‘refugee’, fundamentally shape the interactions between individuals and bureaucratic organizations. As Taylor stresses in this volume (Taylor, 2023), that connection is often obscured when computational methods and new data sources are used.

2.2.1 A Review of the Usefulness of Non-traditional Data to Study Different Types of Mobility

2.2.1.1 Local and National Mobility

At the micro and meso levels, geotagged digital trace data from call detail records (CDR) (e.g. Song et al., 2010), GPS technology (e.g. Bachir et al., 2019; Cui et al., 2018; Huang et al., 2018) or social media data (e.g. Bao et al., 2016) can be used to study individual (Giannotti et al., 2011; González et al., 2008; Pappalardo et al., 2015; Wang et al., 2011) as well as group mobility (Hiir et al., 2019; Lulli et al., 2017; Tosi, 2017). Because of their wide coverage (see Footnote 3) and ad hoc availability, these data allow studying population movements in emergency situations, such as during natural disasters (Bengtsson et al., 2011) or events like the Covid-19 pandemic (e.g. Xiong et al., 2020). In other contexts, satellite data have been used to estimate the effect of extreme climate events, such as flooding, on migration (Chen et al., 2017). Compared to self-reporting on causes of migration in surveys, such data offer the advantage of not being affected by subjective factors such as recall bias.

While individual characteristics such as gender, age or motivations for mobility, which are key variables for policy responses, are usually unavailable, researchers have started to collect survey data or link them with geotagged digital trace data to alleviate this limitation and obtain more information on the demographic characteristics of the populations covered (Blumenstock & Fratamico, 2013). Other sources used for mobility research include Twitter (e.g. Fiorio et al., 2017; Zagheni et al., 2014), Skype (e.g. Kikas et al., 2015), LinkedIn (e.g. Li et al., 2019) or Flickr (Bojic et al., 2016), and could include any other platform that provides geotagged data on its users. Their usefulness for policy purposes fundamentally depends on how well represented the population of interest is on the specific platform.

Large platform companies like Apple or Google also possess vast repositories of human movement data that could be used to understand local mobility patterns. While these companies do not normally publish their data for research purposes, they offered ad hoc data products and visualizations of the aggregated mobility of their customers, including the use of travel modes (public transport, driving, walking), during the Covid-19 pandemic (Apple, 2021; Google, 2021). Notably, however, the lack of information on methods and on the underlying population captured makes these data and their biases opaque, limiting their usefulness for policy purposes.

2.2.1.2 International Mobility

Social media advertising platforms can help estimate stocks and sociodemographic profiles of certain populations and facilitate non-probability sampling, since the platforms support showing ads exclusively to certain audiences. This capability has also been used to invite specific populations, such as Polish migrants in European countries, to participate in a survey through paid Facebook advertisements (Pötzschke & Braun, 2017). Compared to traditional surveys, this approach offers the advantage of targeting demographic characteristics to reach a larger sample size at a global scale quickly and at lower cost (Rampazzo et al., 2021).

Böhme et al. (2020) used georeferenced online search data from Google Trends (looking for the combination of migration- and target country-related keywords as a proxy for migration intentions) in origin countries to improve the predictive power of international migration models. While there are promising examples in different areas employing Google Trends data, such as to forecast private consumption (Vosen & Schmidt, 2011), the precision and goodness of fit of such models can also rapidly change (Lazer et al., 2014).

2.2.1.3 A Special Case: Airline Mobility

A major source of big data on travel is airline reservation systems (ARS). A handful of private companies dominate this market. They handle such information in a form that omits not only personal information but also categorical groupings of passengers’ sociodemographic characteristics. One of these companies, Sabre, sells an air travel dataset that reports monthly data on the numbers of air travellers between all world airports and regular airline routes. Capitalizing on this source, in combination with the more traditional statistical reports of the United Nations World Tourism Organization, researchers have created a Global Transnational Mobility Dataset which details cross-border trips between all sovereign states worldwide from 2011 to 2016 (Recchi et al., 2019). Other studies have used Sabre data to infer types of transnational mobility (Gabrielli et al., 2019), the economic impact of reduced mobility due to Covid-19 (Iacus et al., 2020) and the global spread of the pandemic in 2020 (Recchi et al., 2022).

Potentially, similar data could be collected from other ticket reservation systems for bus, railway or sea lines, but such forms of transportation tend to be highly national or regional, rather than global, and thus integrating the different sources may be an issue. At any rate, this is an evolving area of data collection that has proven fruitful for macro analyses of international flows. Its major limit is that a trip is an event, not a person, thus leaving uncharted the characteristics of the human populations that experience cross-border travel, which survey research describes as mostly – albeit not exclusively – drawn from among the middle-upper classes (Demoli & Subtil, 2019).

Along these lines, Chareyron et al. (2021) used data scraped from the digital platform Tripadvisor to examine privileged mobility patterns. Other platforms for evaluating tourism consumption (accommodation, places, activities) that might be leveraged for research on this issue in certain contexts include Booking, Airbnb, Hotels.com or Weibo.

2.2.1.4 Difficulties to Infer Policy-Relevant Categories from Digital Trace Data

While digital devices trace the geolocations of their users, there are no standards or commonly respected methodological frameworks for producing estimates of policy-relevant information from granular geo-located data points (Bell et al., 2015), and the analysis of such data by data scientists who lack context-specific knowledge and understanding of the social phenomena underlying human mobility creates new risks (McAuliffe & Sawyer, 2021). Unlike survey data about respondents’ residential history, georeferenced digital trace data only record locations at specific moments in time. Blondel et al. (2015) and Chi et al. (2020) introduce different estimation techniques to infer patterns of human mobility from observational geotagged data. Without further context-specific information, it is not straightforward to determine what the location of a given individual corresponds to (Fiorio et al., 2021). For this reason, how researchers define the features of trips in order to draw the ambiguous distinction between migration and other kinds of movement, and how they group geo-located data points together based on their temporality, greatly affect the consistency of human mobility estimates generated from digital trace data (Ahas et al., 2018; Fiorio et al., 2021). This points to the challenge of discerning policy-relevant categories from inferred mobility patterns. A risk linked to this labelling process is delinkage, which refers to the replacement of an individual identity by a ‘stereotyped identity with a categorical prescription of assumed needs’ (Zetter, 1991: 44).
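To illustrate how strongly such definitional choices can matter, the following is a minimal sketch in Python, not a method taken from any of the studies cited above. It assumes a hypothetical, time-ordered trace of daily (day, region) sightings for a single user, and a hypothetical rule under which a run of consecutive sightings counts as a ‘residence’ only if it spans at least `min_stay_days`. The function names, the threshold values and the trace itself are invented for illustration.

```python
from itertools import groupby


def stays(observations):
    """Collapse time-ordered (day, region) sightings into (region, first_day, last_day) runs."""
    runs = []
    for region, group in groupby(observations, key=lambda obs: obs[1]):
        days = [day for day, _ in group]
        runs.append((region, days[0], days[-1]))
    return runs


def count_relocations(observations, min_stay_days):
    """Count changes of residence under a hypothetical minimum-stay rule.

    A run of sightings counts as a residence only if it spans at least
    min_stay_days; shorter runs are treated as trips and ignored.
    """
    residences = [
        region
        for region, first_day, last_day in stays(observations)
        if last_day - first_day + 1 >= min_stay_days
    ]
    # Returning to the same residence after a short trip is not a move.
    distinct = [region for region, _ in groupby(residences)]
    return max(len(distinct) - 1, 0)


# Invented trace: 60 days in A, 10 days in B, 30 days back in A, 100 days in C.
trace = (
    [(day, "A") for day in range(0, 60)]
    + [(day, "B") for day in range(60, 70)]
    + [(day, "A") for day in range(70, 100)]
    + [(day, "C") for day in range(100, 200)]
)

for threshold in (7, 30, 90):
    print(threshold, count_relocations(trace, threshold))
```

On this invented trace, a 7-day rule yields three relocations, a 30-day rule yields one and a 90-day rule yields none, although the underlying movements are identical. Real traces are irregular and sparse, so actual estimation techniques are considerably more involved than this sketch.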

In theory, the possibilities opened by the new data sources invite us to revisit some presuppositions of such labelling and categorization processes, and to question the labels researchers apply to people and the functions those categories fulfil, in order to design policies that effectively cater for real rather than stereotyped human needs (see Turton, 2005; Bakewell, 2008). In practice, however, correctly understanding and interpreting the underlying meaning of data variables in different contexts poses several challenges. Certain data, such as those scraped from LinkedIn, may offer relatively straightforward ways to identify a ‘foreign worker’ on the platform, whereas such classification is more difficult and more sensitive to contextual changes when using CDR data. Ahas et al. (2018), for example, in their roaming dataset operationalize a ‘foreign worker’ as someone who made 1 to 52 trips to a certain country in a certain time period. Such an identification strategy would not have worked, however, during the Covid-19 pandemic, when remote work was widespread. Geolocated messages or posts are often the key variable in estimating the geo-coordinates of users, while other studies rely on the language used on social networks; friend or follower networks; profile pictures; names; or other textual information available (e.g. Huang et al., 2014; Kim et al., 2020) to infer users’ sociodemographic characteristics. In the case of Facebook marketing data, researchers have to rely on the categories provided by the platform, even though, as Zagheni et al. (2017) highlight, these categories are not documented according to scientific research standards. This may introduce biases that are hard to disentangle from biases related to selection and non-representativeness, or from other inconsistencies.
‘Naming’ mobile individuals is often based on legal definitions and should be carefully considered, particularly in a context where inaccurate estimates can cause confusion and fuel heavily contested public and political discourses (McAuliffe & Sawyer, 2021).

Without relevant content knowledge of migration and technology use, errors or wrong assumptions can lead to misspecification and misinterpretation (McAuliffe & Sawyer, 2021), exemplified by Pew Research’s 2019 estimates of irregular migrants in Europe (Connor & Passel, 2019). The authors wrongly included asylum seekers whose applications were being processed in the category of irregular migrants, leading to inaccurate and inflated estimates. Drawing on examples like this one, McAuliffe and Sawyer (2021) highlight that, in reality, the application of so-called new data science in the study of migration often fails to take into account the most basic understanding of the topic.

2.2.2 Limitations and Caveats in the Use of Non-traditional Data on Human Mobility

Despite the opportunities offered by non-traditional data sources, their use comes with important limitations and caveats that temper the potential we have outlined so far. In this final section, we list and discuss four of them. Importantly, non-traditional data sources differ significantly in their properties and hence in the concerns they raise.

2.2.2.1 Proprietariness

A first, and rather mundane, problem with some of the above-mentioned data is difficult access. Some data sources require appropriate technical skills (e.g. Facebook and LinkedIn marketing API); some data can be purchased; some sources lack formalized purchasing mechanisms (e.g. mobile phone providers), and others do not share their data at all. Moreover, by employing terms of service (TOS)-compliant methods, a researcher may respect the business prerogatives of the company that created the platform studied, but this may or may not respect the dignity and privacy of the platform users (Freelon, 2018). This is particularly sensitive in a context of radical power asymmetries with the platform/service providers, as users often have far less understanding of who can access their data and under which circumstances, as well as of the functioning of the tools they use online (Broeders & Dijstelbloem, 2015; Taylor, 2023).

2.2.2.2 Non-representativeness

Second, a key and too often overlooked issue with digital trace data – as with many other kinds of social science data – is selection bias: users of a particular social media platform or mobile phone provider are not representative of the underlying general population. In the analysis of CDR, selection bias regarding mobile phone ownership and usage must be considered when extrapolating from the number of moving SIM cards to the number of moving persons (Blumenstock, 2012; Blumenstock & Fratamico, 2013). For instance, in some sub-Saharan African countries, men are more likely to be mobile phone owners, while phone sharing is common among rural women, and there is considerable cross-country variation: while mobile phone records in Kenya are an excellent proxy for mobility, regardless of socioeconomic factors, mobile phone data in Rwanda are a good proxy only for the mobility of wealthy and educated men (Luca et al., 2021). While existing studies have shown that approaches using CDR data work well for estimating general population displacement in one-off emergencies, such as the earthquake in Haiti (Bengtsson et al., 2011) and other disaster events (Chen et al., 2017), ad hoc knowledge is needed about who is using phones or services. Otherwise, such approaches cannot identify vulnerabilities of specific populations, a key aspect of targeting social protection and relief (Lu et al., 2016). Similarly, Facebook and Twitter adoption rates differ between countries and depending on user characteristics, such as age or gender (Zagheni et al., 2017). When studies rely on data from highly specialized online services, users’ self-selection into these services limits the generalizability of the results (Böhme et al., 2020). For instance, LinkedIn may be useful to study the labour mobility of highly educated individuals in rich countries and allows researchers to link this to career choices and industry-specific patterns.
However, it cannot yield mobility estimates for the global population. This is problematic because, as Sîrbu et al. (2021) highlight, being unable to track specific groups of users can steer migration policies in directions that unwittingly perpetuate discrimination or neglect the needs of invisible groups.

Different statistical approaches help to correct for selection bias. Zagheni and Weber (2015) propose a method that relies on calibration of the digital trace data against reliable official statistics. When the data also contain demographic information about users of a given platform, that information can be leveraged to de-bias non-representative results by adjusting the responses via multilevel regression and post-stratification (Wang et al., 2015). Importantly, statistical calibration models require datasets containing enough variables for the use of post-stratification techniques, as well as knowledge about the specific functional relationships underlying estimates of migration, how these relationships vary by geography and population characteristics, and how they change over time – information that is often not available and that requires systematic and hence costly on-the-ground research. Since the composition of the user bases of new data sources may change rapidly, predicting variation over time is usually more difficult than understanding cross-spatial variation in human mobility.
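The basic logic of post-stratification can be sketched in a few lines of Python. This is a deliberately simplified illustration: it reweights observed cell-level rates to known population cell sizes and omits the multilevel regression step, so it is not the procedure of Wang et al. (2015) or of Zagheni and Weber (2015). The cell labels, user counts, mobility rates and population figures below are all invented for illustration.

```python
def raw_rate(cell_rates, sample_counts):
    """Unadjusted rate: cells weighted by how many platform users fall in each."""
    total_users = sum(sample_counts.values())
    return sum(cell_rates[c] * sample_counts[c] for c in sample_counts) / total_users


def poststratified_rate(cell_rates, sample_counts, population_counts):
    """Reweight cell-level rates to known population cell sizes.

    Raises ValueError if a population cell has no platform users, because
    plain post-stratification cannot estimate a rate for an unobserved cell.
    """
    empty = [c for c in population_counts if sample_counts.get(c, 0) == 0]
    if empty:
        raise ValueError(f"no platform users observed in cells: {empty}")
    total_population = sum(population_counts.values())
    weighted = sum(cell_rates[c] * population_counts[c] for c in population_counts)
    return weighted / total_population


# Invented example: the platform over-represents younger users.
sample_counts = {"under_40": 800, "40_plus": 200}      # platform users per cell
cell_rates = {"under_40": 0.10, "40_plus": 0.02}       # share observed moving
population_counts = {"under_40": 300, "40_plus": 700}  # e.g. census counts (thousands)

print(round(raw_rate(cell_rates, sample_counts), 3))
print(round(poststratified_rate(cell_rates, sample_counts, population_counts), 3))
```

In this invented case the unadjusted platform rate is 0.084, whereas the rate reweighted to the population structure is 0.044. The sketch also makes visible the requirement noted above: the adjustment is only possible for cells in which the platform has users and for which reliable population totals exist.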

Beyond challenges that are common to any survey, such as selection bias and nonresponse, some pitfalls are specific to non-probability sampling on social media. Not only does non-probability inclusion lead to non-representative data, but the sampling error is further compounded by the self-selection of users, which may be affected by issues of trust and incentives, and by the platform’s algorithm.

To alleviate some of these shortcomings and generate more reliable and comprehensive estimates, it is key to draw on a number of different data sources and develop methods of analysis that are robust to the absence of any specific data source (European Commission, 2016). For example, Huang et al. (2021) used Twitter, Google, Apple and Descartes Labs data to disentangle the disparities in mobility dynamics between lower- and higher-income US counties during Covid-19. They found that mobility from each source presented unique and even contrasting characteristics. Their (optimistic) conclusion is that hierarchical Bayesian methods can be used effectively to combine different mobility data in a consistent way. However, this requires the availability of different mobility datasets as proxies for the same phenomenon, which at the global level – given that the data showed contrasting characteristics even for the USA – seems extremely hard to achieve.

2.2.2.3 No Gold Standard

Third, it must be acknowledged that a proper gold standard does not exist since precise current and past mobility patterns are unknown. Therefore, validation of nowcasting models of human mobility is not straightforward. While traditional data sources have a number of limitations and caveats, without a benchmark it is difficult to trust new data sources and innovative approaches and assess their validity (European Commission, 2016). Therefore, a combination of traditional and new data might yield more accurate estimates and predictions than solely relying on non-representative sources (Lazer et al., 2014; Zagheni et al., 2017).

2.2.2.4 Ethical Concerns

Finally, we deem it appropriate to underscore some ethical caveats and raise frequently ignored data justice-related questions, which Taylor discusses in this volume (Taylor, 2023). While statistical techniques may alleviate the shortcomings of new data sources, the use of some of this information, notably individual-level data (CDR, social media data), raises severe ethical issues. Anonymization – i.e. removing personal identifiers – is a commonly used method to protect users’ privacy, but it is sufficient neither to shield privacy nor to address issues related to informed consent, since in large mobility datasets, individuals can be reidentified with as few as four spatial-temporal data points, even if the data do not contain identifiable information like names or email addresses (de Montjoye et al., 2013). Precise, always-on tracking of individuals, which yields a spatiotemporal history of their trajectories and draws a picture of how people use city space or move across borders and of how they break rules and create informal ways to support themselves, is a sensitive matter in any context, especially when data are unobtrusively collected without informed consent. Risks are aggravated in an environment where geolocations might be mapped to addresses such as religious places, abortion clinics and other sensitive areas. In the context of migration, where many individuals are vulnerable and political freedoms cannot be taken for granted, these concerns are particularly important. It is key to consider what it means if mobile populations become more legible and, thereby, more amenable to control from above (Scott, 2008). Individual invisibility may sometimes be life-saving or, at least, grant a basic right to personal freedom.
As Polzer and Hammond (2008) insist, ‘researchers who lift this veil [of invisibility] in the name of illuminating “creative livelihood strategies” or “flexible identities” may inadvertently be alerting powerful states, the UN or NGOs to the ways in which their rules are circumvented, and thereby reduce the space for life-saving creativity and flexibility in remaining invisible’. While visibility to institutions that are seen as potential allies might increase access to resources, defence of rights and legitimacy and can hence be seen as an ethical imperative, invisibility may serve as a protective shield in the absence of true legal, political and social protection, and in contexts of xenophobic and majoritarian violence. As governments and public agencies are increasingly using digital technologies for more efficient, neutral and disembodied migration management and border control (Latonero & Kift, 2018; Trimikliniotis et al., 2015), Leurs and Smets (2018) remind us that it is important to ponder how approaches, methodologies, tools and findings may be co-opted or used in unintended and undesirable ways. Actors interested in better understanding human mobility may include organizations like the United Nations and aid agencies, but also private sector subcontractors, as well as actors in the ‘migration industry of connectivity services’ (Gordano Peile, 2014), such as money transfer services, mobile phone companies targeting refugees or even illegal organizations exploiting irregular migration.
In a context where Western states direct much attention and investment to monitoring and combatting irregular migration in some geographical areas (Andersson, 2016; Słomczyńska & Frankowski, 2016; Triandafyllidou & McAuliffe, 2018), journalistic coverage (BBC, 2021) of the terrifying final hours of a fatal attempt to cross the English Channel exemplifies not only the centrality of technology use in life-saving efforts but also the risks that digital traces pose to individuals, as reflected in migrants tossing their phones into the waves to protect people traffickers’ identities or to hide details that might prevent their asylum claims from being accepted.

The key here is to try to minimize people’s vulnerability in the face of unequal power relations. This may entail very different decisions when trying to better understand the labour mobility of highly educated intra-European migrants, or when analysing intra-city mobility patterns by vehicle type, than when dealing with marginalized and vulnerable groups. Since we know about certain minority populations’ reluctance to participate in routine demographic exercises after experiences of marginalization and stigmatization (Weitzberg, 2015), any choice to make populations visible without informed consent should be carefully considered.
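The reidentification risk noted above (de Montjoye et al., 2013) can be made concrete with a small sketch. It estimates, for a toy dataset, the share of pseudonymized users whose trace is the only one containing a given number of randomly drawn points from their own trajectory. This is an illustrative simplification under our own assumptions, not the procedure of the cited study: the function name `unicity`, the representation of a point as a (location cell, hour) pair and the toy traces are all hypothetical.

```python
import random
from typing import Dict, Set, Tuple

# A spatiotemporal point: (location cell, hour bucket). Hypothetical encoding.
Point = Tuple[str, int]


def unicity(traces: Dict[str, Set[Point]], p: int, seed: int = 0) -> float:
    """Share of users singled out by p points drawn at random from their trace.

    Users with fewer than p points are skipped. A user counts as unique when
    no other trace in the dataset contains all p sampled points.
    """
    rng = random.Random(seed)
    eligible = 0
    unique = 0
    for points in traces.values():
        if len(points) < p:
            continue
        eligible += 1
        # Sort first so that the draw is reproducible for a given seed.
        sample = set(rng.sample(sorted(points), p))
        matches = sum(1 for other in traces.values() if sample <= other)
        if matches == 1:  # only the user's own trace matches
            unique += 1
    return unique / eligible if eligible else 0.0


# Toy pseudonymized traces: no names, only cells and hours.
toy_traces = {
    "u1": {("cell_A", 8), ("cell_B", 9)},
    "u2": {("cell_A", 8), ("cell_C", 9)},
    "u3": {("cell_D", 8), ("cell_B", 9)},
}

print(unicity(toy_traces, p=2))
```

In this toy case every user shares each single point with at most one other person, yet two points together single out all three, which is why removing names alone does not prevent reidentification.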

3 Concluding Remarks

Based on the above-described potentials and pitfalls of different sources, we recommend that policy-makers use digital data to temper the shortcomings of traditional mobility data – namely, their poor space-time resolution, the limited availability of data disaggregated by sociodemographic characteristics and their delayed availability – when more detailed data are needed to better address policy issues related to inequalities, such as those regarding housing, education, health, employment and non-discrimination. Ultimately, the usefulness of digital data and the methodological challenges they raise depend on research goals. For example, longitudinal mobility estimations using digital data are rendered difficult by changing user bases. No single non-traditional data source captures all types of mobility; rather, the different sources discussed here capture related but partly different phenomena, including urban and international transport, tourism, population displacements, labour mobility of the highly educated and large-scale mobility data from different data providers, where usability depends on coverage and accessibility. Because of their higher granularity, digital data can monitor and evaluate human mobility and population presence at a larger scale and with higher resolution and detail – in real time – spanning from the micro (local) to the meso (national) and macro (international) level. This is particularly important for policy responses in emergency situations (such as humanitarian or public health emergencies). Here, estimations based on digital data sources can help make faster and more informed decisions. However, in other contexts, it should be carefully considered who benefits if individuals on the move and their practices are made visible.

For models to be correctly specified and for estimations to be reliable, in-depth context-specific knowledge about the ways human mobility occurs on the ground as well as knowledge about quickly changing technology use among the populations of interest is fundamental, although difficult and costly to obtain, and has to be constantly updated. Traditional statistics remain important to evaluate and complement estimations as baselines, especially in a context in which migration is the focus of significant political and media attention and is all too frequently misunderstood or misinterpreted. Moreover, as regards migration, it is crucial that legal policy definitions and normative frameworks are respected. The bottom line is that any model of human movement should be carefully tailored to the specific local context. In an unpredictable world, where people move or do not move for myriad reasons and these reasons may change quickly, as the case of Covid-19 exemplifies, we encourage researchers to constantly reassess whether what is being measured reflects the social phenomenon that the measurement is intended to assess and to ensure that their analysis does not generate injustice by rendering people visible in ways that are damaging to their rights and freedoms. This makes the data collection and analysis process more expensive and less universal than sometimes suggested in relation to new data sources and their usefulness for policy-relevant analysis.

Beyond the realm of migration, digital data on human mobility can assist evidence-based policies on transportation – a primary concern in the field of environmental policies. A well-informed understanding of human mobility and its forms is particularly urgent in a context in which a reduction of fossil fuel-propelled transport in rich countries (flights, cruises and car use) is needed to mitigate global warming (Holden et al., 2019; Peeters & Dubois, 2010). A primary instance is the design of incentives to shift passengers to less polluting means of travel – e.g. from airplanes to trains.

Whatever the domain of interest, researchers must be aware that precise, always-on data about individuals, often unobtrusively collected without informed consent, raise several issues concerning privacy and security. Associated risks largely depend on the legal, political and social situations of the individuals or groups eventually covered by the data, on the actors handling these data and on their interests. Operational guidelines on data responsibility are provided, for example, by the Inter-Agency Standing Committee (IASC) of the United Nations system (see Footnote 4). Although far from frontline operations, migration research analysing big data can have a quick impact on policies (for instance, border management) and, thus, on human lives – something that traditional studies of migrants rarely had (McAuliffe & Sawyer, 2021). Precisely because research and policy-making on human mobility have an unprecedented potential to go hand in hand – good news in itself – we can only urge any actor collecting, using, storing and sharing human mobility data to commit to ‘do no harm while maximizing the benefits’ principles (IASC, 2021), always prioritizing the safe, ethical and effective management of personal and nonpersonal data.