1 Introduction

At present, China’s urbanization has entered a new stage in which central cities drive urban clusters and regional economic development. Urban clusters are not only an important vector for urbanization, but also key locales for the gathering of mobile populations. The ongoing gathering of migrant populations in large cities and urban clusters highlights changes in regional economic development, and is reshaping China’s social and economic development patterns. Therefore, the study of population migration in urban clusters is of great importance for macroeconomic decision-making in China.

Population migration and flow have long been research hotspots, and scholars have carried out a series of studies on them, covering the personal characteristics and labor market performance of the floating population (Fan, 2002; Long et al., 2020; Shen & Huang, 2003), types of population migration (Sun & Fan, 2011; Shen & Liu, 2016), the causes and consequent impacts of population migration (He et al., 2019; Li et al., 2020a, b, c; Ma et al., 2019; You et al., 2018), related policies (Chan & Buckingham, 2008; Lu et al., 2019), and the pattern of migration (Fan et al., 2020). The data involved in these studies are mainly national census data, micro-survey data, and big data. In traditional population migration studies, the data are mainly obtained from the national census and micro-surveys (Ke et al., 2022). For instance, Zhu (2007) explored the factors influencing migrants’ willingness to settle based on survey data on the floating population in the coastal areas of Fujian Province; Dong et al. (2014) used national census data to study the network structure characteristics and models of China’s inter-provincial population migration, and pointed out that China’s inter-provincial population migration network presents a clear rightward trend in its evolution. However, national census data and micro-survey data are usually inadequate in terms of timeliness, directionality, and continuity. Moreover, traditional analyses focus mainly on inter-provincial migration, not on inter-city migration (Dong et al., 2014; Liu & Feng, 2014). The broad scope of traditional analyses limits their efficacy in guiding policy-making because such analyses hardly reflect the actual population migration situation.

With the rapid development of internet platforms in recent years, scholars have drawn on big data from social networks, real-time location data for population movements, and total mobile communications data to conduct extensive studies on the changes, characteristics, and patterns of intercity population migration (Luo et al., 2020; Peng et al., 2021; Petzold, 2020). For instance, Blumenstock (2012) examined more subtle population migration patterns based on mobile phone location data for 1.5 million Rwandans over 4 years. Thomas et al. (2014) validated the applicability and scientific soundness of population migration analysis using unweighted commercial microdata by analyzing studies related to population migration in the UK. Neal (2014) compared and analyzed the characteristics and differences of population migration networks composed of different groups and different time periods based on American air passenger flow data.

At the same time, there is a growing literature discussing the use of big data to analyze the main patterns and characteristics of China’s population migration. Jiang and Wang (2017) used such data to construct a network model of intercity daily population movements in China using complex network analysis tools, and they measured and analyzed the characteristics of the complex structure of the network. Wang et al. (2017) examined the search behavior data of internet users and used social network analysis to explore the propensity paths and spatial differences of intercity population flows in the Pearl River Delta urban cluster. Zhao et al. (2018) studied the complex network of population migration in the pan-Yangtze River Delta during China’s Spring Festival by combining national-level dynamic monitoring data for mobile populations with Baidu’s migration data. Lai and Pan (2019) explored the features and spatial patterns of intercity population movements before, during, and after the Spring Festival using Tencent migration big data.

Big data consisting of geographic information, social media, and data from information and communication technologies (Chow et al., 2018; Fang et al., 2015) contain rich spatial–temporal behavior information and effectively make up for the lack of refinement and timeliness of traditional census and sampling data (Li et al., 2016; Wu et al., 2016). This has provided support for developing fine-grained, accurate information on population migration and made intercity population migration research a reality. However, although location data can be used to simulate population migration in combination with statistical data and questionnaire interview data, the data obtained include transitory population movements such as short-term travel and visits, which are clearly different from one-way population migration accompanied by household migration and resettlement. Although research focused on short-term intercity migration has some reference value, its role in guiding policy formulation concerning how to target socioeconomic management in inflow and outflow areas needs to be further examined. In addition, with respect to research content, population migration studies based on big data aim mainly at examining population migration within urban clusters or the network structure of cities (Li et al., 2020a, b, c), while there are few discussions of the direction of population movements and future development trends (Chen et al., 2020). Thus, after excluding short-term population movements, we must consider how intercity migration develops and how migration flows within urban clusters will be shaped in the future. Examination of these issues must be a priority to promote the high-quality development of urban clusters and improve the social and economic development of cities.

Internet search engines have become a principal source of information, and the analysis of the network behavior of large samples of internet users can provide a good understanding of popular demand (Li et al., 2020c; Xu & Gao, 2017). For example, before people migrate, they are likely to use an internet search to query relevant information about the target city they are considering, and assess the risks and costs of migration based on the information they collect. Therefore, information searches reflect people's psychological migration propensity. In fact, psychological migration propensity and real-world migration activities always affect and interact with each other. Psychological migration propensity can often serve as a guide to future migration flows, and this guiding effect can strengthen migration tendencies. Therefore, using the network big data generated by search results between cities over long periods of time, we can calculate the propensity of inter-city migration against the background of the new economic normal. Additionally, using network big data to analyze the propensity to migrate not only excludes short-term population movements, but also allows discussions of intercity population migration propensity and predictions of population migration.

This study is innovative in several respects. Firstly, based on network big data, the paper focuses on an analysis of intercity population migration direction, addressing a problem with much inter-provincial data that fail to reflect population migration patterns in a comprehensive and detailed manner. Secondly, this study supplements existing literature in the areas of migration flow and spatial pattern analysis of population migration. In addition, this research tries to predict and analyze population migration to provide policy formulation guidance that considers future population migration patterns and promotes the development of urban clusters.

2 Data sources and research methods

2.1 Data sources

The primary data source for this paper was Baidu’s index trend data for the active search behavior of internet users (Footnote 1). The Baidu search index is obtained by calculating the weighted sum of the search frequency for each keyword that appears in web searches; the index reflects the attention web users pay to areas of interest (Jiang et al., 2015; Yu & Zhang, 2012; Xu & Gao, 2017). Since each user’s retrieval behavior in Baidu is a measure of active willingness, each retrieval related to migration behavior may be seen as an expression of a user’s willingness to migrate. According to a study of the use of search engines by Chinese internet users carried out by the China Internet Network Information Center in 2019, China had 695 million search engine users as of June 2019, while the national search engine utilization rate stood at 81.3%, and Baidu Search’s penetration rate among search engine users was 90.9%. It can be seen that the data generated by people using network searches can reflect their psychological demands (Gu et al., 2015). Therefore, an analysis of the population migration propensity based on Baidu Index big data, which reflect the active search behavior of network users, is scientific and credible.

To meet the goals and requirements of this research, search behavior data from 317 prefecture-level cities in China were collected for the 24-month period from January 2018 to December 2019. Based on user demand for general information about population flow orientation and the Law of Least Effort (Wang et al., 2017), and taking into consideration data availability and scientific requirements, six 317 × 317 matrices covering two periods were formed using “city j + recruitment”, “city j + map” and “city j + rent” (Footnote 2) as keywords for the 317 cities. This paper first uses the resulting search index matrices to construct a population migration propensity index, then analyzes the population migration propensity of cities in key urban clusters, and finally predicts the future probability of population migration propensity for cities in the urban clusters, so as to assess the present situation and future development of inter-city population migration within urban clusters.

2.2 Population migration propensity model

Population migration propensity refers to the probability that people with certain motives to migrate will move to a locale they have chosen. Since population migration propensity is interrelated with actual migration, studies of the propensity to migrate can assess and predict migration behavior to a certain extent. According to user demand for general information about population flow orientation, potential migrants pay considerable attention to employment opportunities, housing market trends, and the spatial structure and infrastructure of their chosen destinations.

Given the concerns of potential migrants, the migration propensity derived from city “i” to city “j” is assessed mainly in terms of employment opportunities, housing market and the spatial structure and infrastructure of the target city. The migration propensity derived from city “i” to city “j” is modeled as:

$$Propensity_{ij} = \sqrt[3]{Job_{ij} \times Map_{ij} \times House_{ij}} \quad (i \ne j)$$
(1)

where i is the origin city and j is the destination city. \(Propensity_{ij}\) indicates the migration propensity from city “i” to city “j”; \(Job_{ij}\), \(Map_{ij}\) and \(House_{ij}\) denote the search indices recorded in city “i” for keywords that combine the name of the destination city with a topic, namely “city j + recruitment”, “city j + map” and “city j + rent”. \(Job_{ij}\), \(Map_{ij}\) and \(House_{ij}\) indicate, respectively, the searcher’s concern with employment opportunities, spatial structure and infrastructure, and the housing market of the target city, i.e., employment propensity, orientation propensity and settlement propensity.
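As a minimal illustrative sketch (not the authors' implementation), Eq. (1) can be evaluated element-wise on three square search index matrices. The sketch assumes that rows index the origin city “i” and columns index the destination city “j”; the function name `migration_propensity`, the array names, and all numerical values are hypothetical and are not real Baidu index data.

```python
import numpy as np

def migration_propensity(job, map_idx, house):
    """Eq. (1): cube root of the product of the three search indices, for i != j."""
    job, map_idx, house = (np.asarray(a, dtype=float) for a in (job, map_idx, house))
    propensity = np.cbrt(job * map_idx * house)
    np.fill_diagonal(propensity, 0.0)  # Eq. (1) is only defined for i != j
    return propensity

# Made-up indices for three cities (rows: origin i, columns: destination j).
job = np.array([[0, 8, 1], [27, 0, 8], [1, 1, 0]])
map_idx = np.array([[0, 8, 1], [1, 0, 8], [1, 8, 0]])
house = np.array([[0, 8, 1], [1, 0, 8], [1, 1, 0]])

propensity = migration_propensity(job, map_idx, house)
print(propensity)
```

Because Eq. (1) is a geometric mean, a destination that scores zero on any one of the three keywords receives a propensity of zero from that origin, regardless of the other two indices.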

To explore the direction of intercity migration propensity, we measure the difference between the migration propensity from city “i” to city “j” and that from city “j” to city “i”, which is called the net migration propensity (\(M_{ij}\)). When \(M_{ij} > 0\), the migration propensity from city “i” to city “j” is greater than the propensity from city “j” to city “i”, and city “j” is a potential destination for migration from city “i”. The larger the value of \(M_{ij}\), the stronger the net migration propensity from city “i” to city “j”; the smaller the value, the weaker the propensity.

$$M_{ij} = Propensity_{ij} - Propensity_{ji}$$
(2)

In this equation, \(M_{ij}\) represents the net migration propensity derived from city “i” to city “j”, and \(M_{ij} = - M_{ji}\).
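Eq. (2) amounts to subtracting the transpose of the propensity matrix from the matrix itself. The following sketch illustrates this under the same assumed layout (rows as origin cities, columns as destination cities); the function name and the values are hypothetical.

```python
import numpy as np

def net_migration_propensity(propensity):
    """Eq. (2): M[i, j] = Propensity[i, j] - Propensity[j, i]."""
    propensity = np.asarray(propensity, dtype=float)
    return propensity - propensity.T

# Made-up propensity matrix for three cities.
propensity = np.array([[0.0, 8.0, 1.0],
                       [3.0, 0.0, 8.0],
                       [1.0, 2.0, 0.0]])

net = net_migration_propensity(propensity)
print(net)
```

The result is antisymmetric, which matches the property \(M_{ij} = - M_{ji}\) stated above.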

2.3 Probability of population migration propensity

The probability of population migration propensity is the likelihood that a population’s propensity to migrate in geographic space will change during a certain period. This probability is expressed by the ratio of population migration propensity of a city during a certain period to the sum of migration propensity of all cities in the same period. The immigration propensity probability is expressed by the ratio of the immigration propensity of a city to the total immigration propensity of all cities in the given period. In the same way, the emigration propensity probability can be calculated. The net migration propensity probability is the difference between the immigration and emigration propensity probabilities.
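The ratios described above can be sketched as follows. This is one possible reading, assuming that a city's immigration propensity is the sum of the propensities directed toward it from all other cities (a column sum) and that its emigration propensity is the sum of the propensities directed away from it (a row sum); the function name and the data are hypothetical.

```python
import numpy as np

def propensity_probabilities(propensity):
    """Return (immigration, emigration, net) propensity probabilities per city."""
    propensity = np.asarray(propensity, dtype=float)
    total = propensity.sum()
    in_prob = propensity.sum(axis=0) / total   # column sums: propensity toward each city
    out_prob = propensity.sum(axis=1) / total  # row sums: propensity away from each city
    return in_prob, out_prob, in_prob - out_prob

# Made-up propensity matrix (rows: origin, columns: destination).
propensity = np.array([[0.0, 8.0, 1.0],
                       [3.0, 0.0, 8.0],
                       [1.0, 2.0, 0.0]])

in_prob, out_prob, net_prob = propensity_probabilities(propensity)
print(net_prob)
```

Under this reading the immigration and emigration probabilities each sum to one across cities, so the net probabilities sum to zero: a positive value for one city is always offset by negative values elsewhere.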

When a city’s net migration propensity probability is positive, it means that residents of other cities are more likely to move to the city, making it a potential population in-migration city. Conversely, a negative net migration propensity probability indicates that residents of a city are more likely to move out of the city, making it a potential population out-migration city. It is important to note that the positive and negative values of net migration propensity probability do not reflect actual population movements; rather, they measure the potential propensity to move in and out, which can reflect the possible growth rate of the mobile population of a city. When the net migration propensity probability becomes negative, it indicates that the city is a potential outflow city. This does not necessarily mean that people are moving out of the city, but it does indicate that the growth rate of mobile population in the city is slowing. When the net migration propensity probability is positive, it indicates that the city is a potential in-migration city, and that its mobile population is growing. The net migration propensity probability can help to identify the migration propensity type of a city and indirectly reflects the growth rate of a city’s mobile population, making this indicator of particular importance.

2.4 Prediction model of population migration propensity probability based on Markov Chain

In this research, Markov chains are used to predict the probability of intercity population migration propensity and determine the possibility of future population migration. A Markov chain is a form of Markov process, a class of stochastic processes. A Markov process describes transitions between states over time in which the next state depends only on the current state and is independent of earlier states. Because the process can describe not only changes to a time series, but also changes to structure, this theory is widely used in the analysis of population migration flow (Cai et al., 2007; Long et al., 2018). Based on Markov chains, the transition probability matrix and the initial probability matrix of population migration propensity are calculated, and the probability matrix of migration propensity in subsequent years is predicted.
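The forecasting step of a Markov chain can be sketched generically as repeated multiplication of a state distribution by a transition matrix. The text does not specify how the transition probability matrix is estimated, so the sketch below simply takes it as given; the function name `forecast`, the three-state example, and all values are hypothetical.

```python
import numpy as np

def forecast(initial, transition, steps):
    """Return the state distributions for 1..steps periods ahead."""
    state = np.asarray(initial, dtype=float)
    transition = np.asarray(transition, dtype=float)
    if not np.allclose(transition.sum(axis=1), 1.0):
        raise ValueError("each row of the transition matrix must sum to 1")
    path = []
    for _ in range(steps):
        # Markov property: the next distribution depends only on the current one.
        state = state @ transition
        path.append(state)
    return np.array(path)

# Made-up initial distribution and row-stochastic transition matrix.
initial = np.array([0.5, 0.3, 0.2])
transition = np.array([[0.8, 0.1, 0.1],
                       [0.2, 0.7, 0.1],
                       [0.3, 0.3, 0.4]])

path = forecast(initial, transition, steps=11)  # e.g. eleven annual steps
print(path[0])
```

With a fixed transition matrix of this kind, the forecast distributions typically change quickly at first and then level off as they approach a stationary distribution.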

2.5 Scope of this Study

Three geographical areas are considered in this study: the Yangtze River Delta urban cluster, the Pearl River Delta urban cluster, and the Beijing-Tianjin-Hebei urban cluster. These three enormous urban clusters lead economic development in the east of China and play a key role in the entire country’s development. The Yangtze River Delta urban cluster includes 26 cities: Shanghai, Nanjing, Wuxi, Changzhou, Suzhou, Nantong, Yancheng, Yangzhou, Zhenjiang, Taizhou (Jiangsu), Hangzhou, Ningbo, Jiaxing, Huzhou, Shaoxing, Jinhua, Zhoushan, Taizhou (Zhejiang), Hefei, Wuhu, Maanshan, Tongling, Anqing, Chuzhou, Chizhou and Xuancheng. The Beijing-Tianjin-Hebei urban cluster includes 13 cities: Beijing, Tianjin, Zhangjiakou, Chengde, Qinhuangdao, Tangshan, Cangzhou, Hengshui, Langfang, Baoding, Shijiazhuang, Xingtai and Handan. The Pearl River Delta urban cluster has 9 cities: Guangzhou, Shenzhen, Zhuhai, Foshan, Huizhou, Dongguan, Zhongshan, Jiangmen and Zhaoqing.

3 Major findings

3.1 Correlation between population migration scale and migration propensity

Populations with a potential to migrate will inquire about a destination’s orientation, employment opportunities and housing market. There is a certain correlation between the intensity of migration propensity that is calculated using the Baidu index and the scale of population migration. To verify this relationship, this study used data on migration flow from China’s 1% sample survey of 2015. The results show that the correlation coefficient between the scale of population migration and migration propensity in urban clusters is above 0.9. Moreover, after testing for significance, it can be concluded that the method of measuring the intensity of migration propensity through use of the intercity query function on the Internet can characterize intercity migration to some extent (see Table 1).
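A check of this kind can be sketched as a correlation between observed intercity flows and the corresponding propensity values. The text does not state which correlation coefficient was used; the sketch assumes a Pearson coefficient computed over all ordered city pairs with i ≠ j, and the function name and data are hypothetical (the flows are constructed to be perfectly linear in the propensity, purely for illustration).

```python
import numpy as np

def offdiag_correlation(flows, propensity):
    """Pearson correlation between two matrices over all pairs with i != j."""
    flows = np.asarray(flows, dtype=float)
    propensity = np.asarray(propensity, dtype=float)
    mask = ~np.eye(flows.shape[0], dtype=bool)  # drop the i == j entries
    return np.corrcoef(flows[mask], propensity[mask])[0, 1]

# Made-up propensity matrix and synthetic, perfectly linear flows.
propensity = np.array([[0.0, 8.0, 1.0],
                       [3.0, 0.0, 8.0],
                       [1.0, 2.0, 0.0]])
flows = 100.0 * propensity + 5.0

r = offdiag_correlation(flows, propensity)
print(r)
```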

3.2 Comparison of the migration propensity probability of three major urban clusters

China’s intercity population migration propensity matrix was calculated using a migration-related Baidu search index. Given national population migration flows, the intercity population migration propensities of three major urban clusters were calculated by using Eq. (1). Predictions of the in-migration or out-migration propensity attributes of a city were made based on the net migration propensity probability of each city in 2019; these are shown in Fig. 1 and Table 2.

Fig. 1 Net Migration Types of Three Major Urban Clusters in 2019

Table 1 Correlation analysis of population migration scale and propensity.
Table 2 Top 10 ranking of net migration propensity probability of three major urban clusters

Figure 1 shows that the potential migration propensity of the cities in the Yangtze River Delta, the Pearl River Delta, and Beijing-Tianjin-Hebei urban clusters varies widely. Based on their net migration propensity probability, this study classified cities within the urban clusters into three categories: cities with in-migration propensity, out-migration propensity, and relatively balanced propensity. Of the 26 cities in the Yangtze River Delta urban cluster, Fig. 1 shows that cities with potential in-migration propensity include Shanghai, Nanjing, Zhoushan and Hangzhou; the cities with relatively balanced migration propensity are Maanshan, Tongling, Wuxi, Wuhu, Chizhou, Changzhou, Yangzhou, Jiaxing, Anqing, Huzhou and Nantong; and cities with potential out-migration propensity include Shaoxing, Chuzhou, Xuancheng, Ningbo, Zhenjiang, Taizhou, Yancheng, Taizhou, Jinhua, Suzhou and Hefei. Although more cities in the Yangtze River Delta urban cluster have potential out-migration propensity than potential in-migration propensity, the great attractiveness of Shanghai and Nanjing makes the Yangtze River Delta urban cluster a strong attraction.

Calculation of the net migration propensity probability among the 9 cities in the Pearl River Delta urban cluster shows that Zhuhai, Shenzhen, Huizhou, and Guangzhou have in-migration propensity; Zhaoqing and Zhongshan have relatively balanced propensity; and Jiangmen, Dongguan and Foshan have out-migration propensity. Owing to their rapid economic development, urban inclusiveness and strong talent attraction policies, Zhuhai and Shenzhen have long been cities with strong in-migration propensity in the Pearl River Delta urban cluster. Guangzhou, an old first-tier provincial capital city, is under pressure to control its population size, just as Beijing and Shanghai are. Furthermore, because the development of other cities in the Pearl River Delta has the effect of diverting some migrants away from Guangzhou, the net migration probability of Guangzhou remains at a relatively low level. Dongguan, a manufacturing base, has encountered bottlenecks during its industrial transformation and its economic growth has slowed down significantly, making it a place with potential out-migration propensity and a slower growth rate of its mobile population.

Only a small proportion of the cities in the Beijing-Tianjin-Hebei urban cluster have in-migration propensity. Of the 13 cities in this urban cluster, Tianjin, Qinhuangdao and Hengshui have in-migration propensity; Zhangjiakou, Chengde and Cangzhou have relatively balanced migration propensity; and Xingtai, Langfang, Shijiazhuang, Handan, Baoding, Beijing, and Tangshan have out-migration propensity. To meet policy goals calling for the phasing out of non-capital core functions and to support the coordinated development of the Beijing-Tianjin-Hebei region, the city of Beijing has been engaged in industrial deconstruction and population control in recent years. Beijing’s activities have given nearby cities such as Tianjin, Qinhuangdao, and Hengshui more opportunities to develop, making these cities attractive places with in-migration propensity. Moreover, the development of these cities is helping the inflow population to be more widely distributed, allowing Beijing to become a city with out-migration propensity that will continue to see a slower growth rate of its mobile population in the future. In 2010, the growth rate of Beijing’s mobile population was 14.735%; it had decreased to −3.739% in 2018.

3.3 Comparison of migration propensity paths of three major urban clusters

The general direction of migration propensity of cities within urban clusters can be determined after obtaining the absolute value of the net population migration propensity between each pair of cities. It can be seen in Table 3 that both the Yangtze River Delta urban cluster and the Pearl River Delta urban cluster are apt to receive migrants from external cities, while the population migration of Beijing-Tianjin-Hebei urban cluster consists of migration to cities outside of the urban cluster and movement between cities within the urban cluster. This indicates that both the Yangtze River Delta and Pearl River Delta urban clusters are potentially more attractive to migrant populations than the Beijing-Tianjin-Hebei urban cluster. Among the three clusters, population inflow from areas external to the Pearl River Delta urban cluster exceeds population outflow from the urban cluster to external areas. Population outflow from the Yangtze River Delta urban cluster to external areas exceeds population inflow from areas external to this urban cluster. Therefore, overall, the Pearl River Delta urban cluster is potentially the most attractive urban cluster, and this is inseparable from its place in the overall scheme of economic development of China.
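Ranking migration propensity paths can be sketched as sorting the city pairs by their net migration propensity and keeping the largest values. The sketch assumes that each pair is reported once, in its dominant direction (the positive entries of the net matrix, which equal the absolute value for that pair); the function name, city labels, and values are hypothetical.

```python
import numpy as np

def top_paths(net, names, k=10):
    """Return the k strongest (origin, destination, net propensity) paths."""
    net = np.asarray(net, dtype=float)
    # Positive entries give each pair once, oriented from origin to destination.
    origins, dests = np.nonzero(net > 0)
    order = np.argsort(-net[origins, dests], kind="stable")[:k]
    return [(names[origins[n]], names[dests[n]], float(net[origins[n], dests[n]]))
            for n in order]

# Made-up antisymmetric net propensity matrix for three cities.
names = ["A", "B", "C"]
net = np.array([[0.0, 5.0, 0.0],
                [-5.0, 0.0, 6.0],
                [0.0, -6.0, 0.0]])

paths = top_paths(net, names, k=2)
print(paths)
```

Whether a path counts as internal or external to an urban cluster can then be decided by checking whether both endpoints belong to the cluster's city list.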

Table 3 Top 10 paths of net migration propensity of three major urban clusters
Table 4 Population migration propensity ranking of cities in the Yangtze River Delta Urban Cluster
Table 5 Population migration propensity ranking of cities in the Pearl River Delta Urban Cluster
Table 6 Population migration propensity ranking of cities in the Beijing-Tianjin-Hebei Urban Cluster

Among its top 10 migration propensity flows, the Yangtze River Delta urban cluster has seven internal migration paths and three migration paths between the urban cluster and cities external to the Yangtze River Delta area. Among these, Shanghai is the destination for three migration paths. This indicates that despite policies for megacities that call for regulation of population growth and support for industrial upgrades and transfers, Shanghai, as the center of China’s economic development, is still attractive and remains the first choice of destination for people in many provincial capitals and other cities of the Yangtze River Delta. Furthermore, net inflows to Shanghai are expected to continue to rise. At the same time, nearly 50% of the top ten migration paths of the Yangtze River Delta urban cluster show population movement from Hefei to other cities. The Hefei Comprehensive National Science Center plan was approved by the Chinese government in 2017, and Hefei became a designated national science and technology city, along with Shanghai. This designation brings with it many opportunities for Hefei, making its potential to attract migrants self-evident. Hefei may nonetheless show a net out-migration propensity because the emigration propensity of its population is greater than its immigration propensity. Although the number of cities in the Yangtze River Delta area with population outflow potential is greater than the number of cities with inflow potential, the enormous attraction of Shanghai and Nanjing makes the Yangtze River Delta urban cluster attractive overall.

The Pearl River Delta urban cluster has six internal migration paths, indicating that this region has migration propensity that is both balanced and relatively active. Zhuhai has become a key target city for migration, both from cities within the urban cluster and from external cities. For Guangzhou, the capital city of Guangdong province and one of the key cities of the Pearl River Delta urban cluster, the migration propensity toward Zhuhai is greater than that from Zhuhai toward Guangzhou. This is largely because of pressure to control the size of Guangzhou’s population, and in part because the rapid development of Zhuhai has diverted people away from Guangzhou.

There are six internal migration paths in the Beijing-Tianjin-Hebei urban cluster, but most of these are internal migration flows from cities within Hebei province and from Tangshan to other cities within Hebei province. This indicates that compared with places such as Qinhuangdao, Zhangjiakou, Langfang, Cangzhou and Hengshui in Hebei Province, Tangshan is at a disadvantage in attracting migrants, because its economic development is very dependent on secondary industries, and the city lacks economic vitality and has serious unemployment problems. The external population migration paths of the Beijing-Tianjin-Hebei urban cluster all originate in Beijing, making it evident that Beijing’s people have a strong out-migration propensity. At the same time, this fact also shows that the correlation between different migration propensities is not high, despite the physical proximity between Beijing and its neighboring cities.

Every city is both a destination and a place of origin. Tables 4, 5 and 6 allow us to identify the top three origin cities with out-migration propensity and top three destination cities with in-migration propensity for every city in each of the three urban clusters examined in this study. In terms of the migration propensity for cities in the Yangtze River Delta urban cluster, Shanghai, Hangzhou, Suzhou, and Nanjing have both active in-migration and out-migration propensities, followed by Hefei, Ningbo, and Jiaxing. Regarding population migration propensity between cities within the Yangtze River Delta urban cluster, the migration propensity from Shanghai to Hangzhou, Suzhou, Nanjing, and Hefei is much greater than that from Shanghai to other cities. In the future, Shanghai’s floating population will come mainly from Hangzhou, Suzhou, Nanjing, Hefei, and Ningbo. Regarding the population migration propensity between the Yangtze River Delta urban cluster and external cities, the migration propensity from Beijing to Shanghai, Hangzhou, Suzhou, Nanjing, and other cities in the cluster is the key component of the migration propensity between external cities and the Yangtze River Delta urban cluster. In the Pearl River Delta urban cluster, Guangzhou and Shenzhen are the most active cities in both types of migration propensity. They are followed by Dongguan and Foshan. It is worth noting that, except for migration from Beijing to Shenzhen, the migration paths with highly active propensity are all between cities within the Pearl River Delta urban cluster. Population migration flows in the Pearl River Delta urban cluster mainly involve Guangzhou, Shenzhen, and Zhuhai. In comparison, excluding Beijing and Tianjin, only a small number of cities in the Beijing-Tianjin-Hebei urban cluster have active migration propensity. The migration propensity between cities within this urban cluster and external cities is concentrated in economically developed cities and is relatively weak. Beijing is both the main potential destination city and the main potential city of origin for other cities in the Beijing-Tianjin-Hebei urban cluster. However, Beijing’s high level of emigration propensity weakens the net migration propensity between other cities and Beijing.

3.4 Prediction and comparison of migration propensity in major urban clusters

Markov chains are used to calculate the net migration propensity probability from 2020 to 2030 for each city in the Yangtze River Delta, Beijing-Tianjin-Hebei, and Pearl River Delta urban clusters; the results are shown in Fig. 2. As a result of policies to regulate population in megacities and China’s strategy of promoting urban clusters to achieve high-quality development, Beijing will remain a city with net out-migration propensity in the future, and the growth rate of its mobile population will slow down significantly. At the same time, Zhuhai, Guangzhou, Shenzhen, and other cities in the Pearl River Delta urban cluster will be major migration destinations in the future, and their mobile populations will grow rapidly.

Fig. 2 Migration Propensity Probability of Each City in Three Major Urban Clusters from 2020 to 2030

Moving forward to 2030, Shanghai will remain a city with strong in-migration propensity in the Yangtze River Delta urban cluster, but this propensity will weaken over time, as will its population growth. In 2019 Hangzhou was a city with weak in-migration propensity; however, in the decade that began in 2020, Hangzhou will become a city featuring out-migration propensity, which can be attributed to the great attractiveness of Shanghai. Also of note, Hangzhou’s e-commerce industry is poised to grow in the future, and this industry tends to attract migration inflows. Taken together, the attraction of Shanghai and the e-commerce industry are gradually flattening the range of Hangzhou’s net migration probability, which stands around -0.005. In recent years, Zhoushan has attracted people’s attention with the construction of the Zhoushan New District and its good environmental quality, and Zhoushan will have significant in-migration propensity in the future. In the Yangtze River Delta urban cluster, although the in-migration propensity probability of Shanghai has begun to decline, it remains higher than that of other cities in the cluster. The decline is not only the result of the movement of industrial facilities out of Shanghai and policies to regulate population, but also a product of the growing development of other cities in the Yangtze River Delta urban cluster due to a national strategy to promote integrated development of this region. Nanjing, Zhoushan, and other cities will become the next group of cities in the Yangtze River Delta urban cluster that are full of opportunities, with strong migration propensity and growing mobile population flows in the future. In addition, the Yangtze River Delta urban cluster will remain at the “hard core” of China’s economic development and will be the principal destination for population migration in the future.

The predictions of net migration propensity probability for the Pearl River Delta urban cluster show that the in-migration propensities of Zhuhai, Shenzhen, Guangzhou, Huizhou, and Zhongshan will increase significantly from 2020 to 2030, but that this growth will eventually slow and the propensities will stabilize at fixed values. In these cities, in-migration propensity is much higher than out-migration propensity. Shenzhen will become the most attractive city in the Pearl River Delta urban cluster, followed by Guangzhou and Zhuhai. Given their potential to attract migrants, the future growth rates of these multicultural cities promise to be substantial. Because highly industrialized Shenzhen and Dongguan have little remaining capacity, Huizhou, with its unique geographical location and good resources, will become a key city with growth potential for mobile population in the cluster. Zhuhai, Zhongshan, and Jiangmen will develop a joint metropolitan area on the west bank of the Pearl River estuary, which will not only divert some population from Zhuhai but also give Zhongshan and Jiangmen more opportunities to develop. Zhongshan will become a city with in-migration propensity and is poised to become a major in-migration locale in the cluster. Jiangmen is currently a city with out-migration propensity, but the level of this propensity will decline in the future. Zhaoqing and Dongguan will both maintain a balance between in-migration and out-migration propensity, and their population growth rates will remain relatively stable. It is important to note that cities with in-migration propensity will have to cope with growing migrant populations and will be under pressure to provide adequate health care, education, social security, and other public services.
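The levelling-off at fixed values is the behaviour one would expect from a time-homogeneous Markov chain whose distribution approaches a stationary limit. The sketch below finds that limit by repeated multiplication; the three-state transition matrix is hypothetical, and the function name `stationary` and its tolerance are illustrative choices rather than anything taken from the study.

```python
def stationary(P, tol=1e-12, max_iter=10_000):
    """Iterate dist <- dist * P from a uniform start until it stops changing."""
    n = len(P)
    dist = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, dist)) < tol:
            return nxt
        dist = nxt
    raise RuntimeError("no convergence; the chain may be periodic or reducible")


# Hypothetical row-stochastic matrix (assumption, not data from the paper).
P = [
    [0.90, 0.07, 0.03],
    [0.10, 0.85, 0.05],
    [0.05, 0.05, 0.90],
]

limit = stationary(P)
print([round(v, 4) for v in limit])
```

Because every entry of this matrix is positive, the limit does not depend on the starting distribution, which is why projected values flatten out after enough years regardless of the 2020 starting point.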

In the Beijing-Tianjin-Hebei urban cluster, Beijing is under pressure to regulate its population and to reduce the congestion and pollution caused by industrial facilities. Beijing’s response to these pressures has resulted in a significant decline in its in-migration propensity, and the city’s in-migration propensity probability is now lower than its out-migration propensity probability. As a result, Beijing’s mobile population growth rate will not increase in the future, and the size of its mobile population will remain stable. Furthermore, as Beijing continues to implement measures to regulate population growth, problems related to the household registration (hukou) system have become increasingly complex. People are thus less inclined to move to this region, and most cities in this urban cluster have net out-migration propensity. Tianjin is the only city in the Beijing-Tianjin-Hebei urban cluster with net in-migration propensity and a mobile population that will grow in the future. In fact, because of Beijing’s unique advantages, its in-migration propensity is still higher than that of other cities in China, but owing to the implementation of policies to manage population growth, Beijing’s out-migration propensity is greater than its in-migration propensity, and this obscures the strong propensity to move to Beijing. As shown in Fig. 2, the out-migration propensity of Cangzhou, Xingtai, Langfang, Handan, Baoding, Shijiazhuang, and Tangshan will be sharply higher than their in-migration propensity over the next 10 years, and this will reduce the proportion of mobile population to total population in these cities. The in-migration and out-migration propensities of Qinhuangdao, Zhangjiakou, Hengshui, and Chengde will remain balanced, indicating that the mobile population growth rates of these cities will not change significantly in the future. In addition, the capacity of the Beijing-Tianjin-Hebei urban cluster to absorb additional mobile population will remain lower than that of the Yangtze River Delta and Pearl River Delta urban clusters.

4 Conclusion and discussion

The development of the Internet and big data has made it possible to study population migration and flow between cities. After constructing population migration propensity indicators from data on the search behavior of Internet users, this study analyzed the probability and direction of population migration propensity for China's three major urban clusters and used Markov chains to predict this probability and thus assess future intercity migration trends within the clusters. Our conclusions are as follows:

  1. Currently, although the Yangtze River Delta, the Pearl River Delta, and the Beijing-Tianjin-Hebei urban clusters are major migration destinations, population migration propensity varies widely among the individual cities that make up these clusters. For numerous reasons, including efforts to regulate population growth, policies to upgrade and relocate industrial facilities in megacities, plans promoting the integration of the Yangtze River Delta area, and the continuing development of the Pearl River Delta and Beijing-Tianjin-Hebei areas, Beijing has become a city with net out-migration propensity, while Shanghai’s in-migration propensity is declining and the increase in Guangzhou’s migration propensity has slowed. At the same time, cities near these three megacities have gained development opportunities, and some of them are experiencing higher in-migration propensity owing to significant achievements in the integration and coordinated development of urban clusters.

  2. With the coordinated development of the Beijing-Tianjin-Hebei region, the integrated development of the Yangtze River Delta, and the continuing development of the Pearl River Delta, cities within the urban clusters will be the main destinations of future migration. Moving forward, Shanghai, Nanjing, and Zhoushan in the Yangtze River Delta urban cluster; Zhuhai, Shenzhen, Guangzhou, Huizhou, and Zhongshan in the Pearl River Delta urban cluster; and Tianjin in the Beijing-Tianjin-Hebei urban cluster will become major destinations of China’s population migration.

  3. The active in-migration and out-migration propensities of the Pearl River Delta and Yangtze River Delta urban clusters show that these clusters will continue to be areas of active population migration. Migration propensity in the Beijing-Tianjin-Hebei urban cluster mainly reflects flows between Beijing and the other cities, and it is much lower than in the Yangtze River Delta and Pearl River Delta urban clusters. This suggests that although Beijing and the other cities in the Beijing-Tianjin-Hebei urban cluster are physically closer to one another than cities in the other two clusters, their migration propensities are not highly correlated.

Using Internet big data, this study revealed the probability of intercity population migration flow in China by analyzing the migration propensity of populations in urban clusters. This analysis facilitates understanding of the scale and trends of population movements in cities and supports the formulation of policies promoting the high-quality development of urban clusters. The enormous volume of Internet big data made it possible for this paper to simulate population migration flow between cities. However, the study rests on assumptions about the proportion of Internet users in the population and about their information dissemination preferences, and the deviations these assumptions may introduce need to be explored in more detail. In addition, many factors, such as the relationship between migration and development, are not included in the analysis. Compared with traditional demographic research focused on inter-provincial migration, the use of simulations based on Baidu index big data, which traditional studies rarely consider, allows this paper to break new ground in studying current migration paths between cities and in calculating the probability and future development of intercity migration propensity.