1 Introduction

To respond to the challenges caused by high gasoline prices, traffic congestion, and greenhouse gas emissions, many countries (such as China and the USA) have shown renewed interest in investing in intercity rail transit infrastructure and services, which offer a range of benefits over automobile travel [1]. Previous studies suggested that the demand for additional intercity passenger rail transit lies in regions where population density and corridor congestion make intercity rail service competitive with automobile and regional air services [2]. However, although the expansion of intercity rail transit systems can bring more equal regional accessibility, it may also lead to more uneven development [3]. Past experience with high-speed railways indicates increased imbalances between major cities and their hinterlands [4–6, 3]. A specific example is that cities connected by the Shinkansen high-speed rail in Japan immediately gained locational advantages, while non-Shinkansen cities were marginalized [6, 3]. Therefore, it is necessary to investigate the suitable intercity rail transit length, or scope, for a region at the planning stage so that investment plans can be better matched to regional conditions. The core research question of this study is how to determine a suitable scope of regional intercity rail transit based on social economy factors and, furthermore, how the planned intercity rail transit scope may affect those factors.

Besides the traditional measures of social economy factors, namely population and GDP, this study also defines several social economy accessibility variables. According to Geurs and Van Wee [7], the accessibility measures adopted in existing research can be classified into four categories: infrastructure-based measures, which analyze the performance or service level of transport infrastructure; location-based measures, which analyze accessibility at locations, typically at the macro level; person-based measures, which analyze accessibility at the level of the individual; and utility-based measures, which analyze the welfare benefits that people derive from access to spatially distributed activities. This implies that, at the regional level, a comprehensive view of accessibility is needed to reflect the nature of land use and of demographic and social economic factors. Three categories of factors that drive the demand for intercity passenger rail are usually cited: population factors (e.g., population growth and densification, increasing urbanization), economic factors (e.g., rising personal transportation costs and economic growth indices such as gross domestic product, GDP), and transportation service factors (e.g., congested highways for automobile travel) [2]. These findings provide valuable insight for developing a method that estimates the intercity rail transit scope by taking the above three categories of factors into account. In this study, transportation, population, and economy accessibilities are defined. Transportation accessibility is expressed mathematically as a function of transportation network density and a travel time index. Population accessibility is related to the distribution of population density, and economy accessibility is based on the local GDP of the area concerned. Thus, this research combines infrastructure-based and location-based measures to reflect comprehensive accessibility at the regional level: transportation accessibility is an infrastructure-based measure, while population and economy accessibilities are location-based measures.

This study chooses the Pearl River Delta (PRD) region in Guangdong Province, China, as the case study for testing the proposed method. In this case study, three “accessibility” variables (transportation, population, and economy accessibilities) are defined as impact factors for investigating the intercity rail transit scope, which is measured by length. The relationship between intercity rail transit length and the above three categories of factors is first estimated with a set of models developed from data on 254 intercity rail transit systems in different countries around the world. On top of this, an accessibility-based approach for spatial suitability analysis of intercity rail transit planning is developed and tested with data for the PRD region as of 2012. The models involved in the accessibility-based approach are embedded into the ArcGIS system to visualize the results. The results of the study are expected to provide a valuable reference for other regions in China, and for other countries with similar needs, in making such investment as beneficial as possible to regional transportation and the associated economic development.

2 Related Research

Although the impact of intercity rail transit on regional development has attracted much attention in transportation studies [4–6, 3], there is little direct quantitative research on the suitable scope of rail transit, especially at the regional level. A few studies have analyzed the social economy conditions for building urban rail systems. For instance, Babalik-Sutcliffe [8] analyzed eight new urban rail systems in the USA, UK, and Canada through qualitative case studies and identified important factors for the success of these systems, including urban factors, planning factors, operating policies, urban planning policies, and external factors; among these, local economic conditions and population and land use density need to be considered before the planning stage. Cao and Lin [9] built a regression model to explore the suitable metro length for cities with different social economy factors, based on data from 121 cities around the world. Loo and Cheng [10] argued that cities with a population of 5 million or above and a per capita GDP of US$11,400 (2008 dollars) are suitable for building metro systems, based on data for 60 metro lines in 21 cities on different continents. However, discussion of the relationship between suitable rail transit scope and regional and urban social economy conditions is still limited. Thus, this study tries to fill the research gap by investigating the suitable intercity rail transit scope through quantitative analysis of regional social economy factors.

Traditional studies generally use population, GDP, and land use to measure social economy factors, while accessibility provides a new measure of regional social economy factors. In transportation geography, accessibility usually refers to the ease of reaching desired roadway infrastructure or destinations in a transportation network [11, 12]. In a Geographic Information System (GIS) environment, accessibility can be interpreted conceptually as reflecting the spatial interactions between transportation services and the land use setting, as well as how easily people can use the transportation system to meet their social economic needs. Many previous studies have associated accessibility with potential opportunities for spatial interaction between transportation activities and land use and social economic factors [13].

The relationship between accessibility and land use is widely recognized in planning for both urban and rural environments [14–19]. Other social economic factors are also tied to accessibility, such as poverty, health, and production patterns [20]. The study by Jalan and Ravallion [21] indicates that rural roads are a key factor in explaining why some rural households are able to increase their consumption levels more than others. Bryceson et al. [22] interpret the complex and conditional linkages between rural road networks and poverty reduction in terms of mobility enhancement, such as enhanced possibilities to travel for work or services, and accessibility enhancement, such as increased attractiveness of rural areas for service infrastructure and staffing.

Hanson [16] argued that accessibility consists of two main parts: proximity and mobility. Accessibility was then estimated as travel time, which depends on both proximity and mobility. In some studies, accessibility was modeled with simple measures, such as distances from roads, towns, and other destinations [15, 18]. In other studies, accessibility was examined by comparing those variables across different villages with an interview-based approach [23]. Euclidean distance to roads was often used as a simple measure of accessibility [15]. Although no standard or mature quantitative accessibility model was found in the literature review, the literature inspired the authors to explore a new approach that defines a set of regional accessibilities which can be measured and analyzed in a GIS environment.

3 Research Methodology

3.1 Study Case

The PRD region includes nine cities of Guangdong Province and is one of the most densely populated regions in China. It has experienced the most rapid urban sprawl over the past decades. In 2008, China announced plans to mesh the major cities within the PRD region, namely Guangzhou, Shenzhen, Dongguan, Zhaoqing, Foshan, Huizhou, Jiangmen, Zhongshan, and Zhuhai, into a single megacity. A series of massive infrastructure projects has been carried out to merge the transport, energy, water, and telecoms networks across these nine cities, and relentless development has taken place since then. The World Bank recently named the PRD region the biggest urban area in the world in terms of population and geographical size [24]. The PRD is hence a good exemplary region for a case study.

3.2 Methodology

First, to check the suitability of a proposed intercity rail transit system for the local conditions in a region (e.g., the PRD region in Guangdong Province, China), we set up a profile of the relationship between intercity rail transit scope and social economy factors.

The length of a regional intercity rail transit system (L) is used to measure its scope. Regional population (P), land use area (S), and GDP are treated as the social economic impact factors to be correlated with the length of the corresponding intercity rail transit system. Datasets for 254 intercity rail transit systems from different countries were collected, including 129 samples in Europe, 60 in North America, 38 in Asia, 13 in South America, 7 in the Pacific, and 7 in Africa, covering both the developed and the developing world. These samples are used to explore the correlation between L and P, S, and GDP. The data are mainly from statistical reports on world urban areas [25–27]. The “Enter” and “Stepwise” methods are then used to develop regression models that relate intercity rail transit length to population, land use, or GDP. The models are then applied to the PRD region to estimate the suitable scope of intercity rail transit based on the social economy factors of the region. The population, land use area, and GDP data at the city, county, and town levels are from the statistical yearbooks of the cities and counties of the PRD.
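As a minimal sketch of the “Enter” step, the following Python code fits a linear model of L on P, S, and GDP by ordinary least squares with all predictors entered at once. The linear form, the function name `fit_enter_model`, the units, and the synthetic sample values are assumptions made for illustration only; they are not the models or data of this study, and the “Stepwise” variable selection is not shown.

```python
import numpy as np


def fit_enter_model(P, S, GDP, L):
    """Fit L = b0 + b1*P + b2*S + b3*GDP by ordinary least squares.

    All predictors enter the model simultaneously, which is how the
    "Enter" method is assumed to behave in this sketch.
    """
    P, S, GDP, L = (np.asarray(v, dtype=float) for v in (P, S, GDP, L))
    X = np.column_stack([np.ones_like(P), P, S, GDP])
    coef, *_ = np.linalg.lstsq(X, L, rcond=None)
    residuals = L - X @ coef
    r_squared = 1.0 - np.sum(residuals ** 2) / np.sum((L - L.mean()) ** 2)
    return coef, r_squared


# Synthetic, noise-free sample (hypothetical values, not the 254-system dataset).
rng = np.random.default_rng(0)
P = rng.uniform(0.5, 20.0, size=40)        # population, millions
S = rng.uniform(100.0, 5000.0, size=40)    # land use area, km^2
GDP = rng.uniform(5.0, 500.0, size=40)     # GDP, billion US$
L = 10.0 + 3.0 * P + 0.01 * S + 0.5 * GDP  # rail transit length, km

coef, r_squared = fit_enter_model(P, S, GDP, L)
print(coef, r_squared)
```

Because the synthetic lengths are generated without noise, the fit should recover the generating coefficients, which makes the sketch easy to check before real data are substituted.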

Second, three accessibility variables are defined to check the modeled rail transit scopes. Transportation accessibility is redefined by integrating the density of the transportation network with the distribution of shortest travel times from the centroid of a town (or node) to all other town nodes. The transportation system in the PRD region includes the highway network and the intercity rail transit system. The highway network consists of freeways, highways, state highways, and county highways. The railway system includes single-route and round-route rail transit lines. The density of the transportation network ($R_i$) is measured by the following equation.

$$R_{i} = \sum\limits_{k = 1}^{n} \frac{m_{k} l_{k}}{S}$$
(1)

where $m_k$ = land area per unit length of the kth highway or railway located within town i; $l_k$ = length of the kth highway or railway within the town; n = number of highways and railways within the town; and S = total area of the town.
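Equation (1) can be illustrated with a short Python sketch. The function name `network_density`, the units, and the sample segment values are hypothetical and chosen only to show the arithmetic; they are not taken from the Guangdong Standards.

```python
def network_density(segments, town_area):
    """Transportation network density of one town, following Eq. (1).

    segments  -- iterable of (unit_land_area, length) pairs, one per highway
                 or railway in the town; unit_land_area is in km^2 per km
                 and length is in km
    town_area -- total area of the town in km^2
    """
    if town_area <= 0:
        raise ValueError("town_area must be positive")
    return sum(m * l for m, l in segments) / town_area


# Hypothetical town: a 12 km freeway and a 30 km county highway in 50 km^2.
example_segments = [(0.04, 12.0), (0.02, 30.0)]
density = network_density(example_segments, 50.0)
print(density)
```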

The unit land-area values (m) are obtained from the Highway and Railway Construction Specification and Standards of Guangdong Province, China. However, the Standards do not recommend values for state highways and county highways. For these, the values are assumed based on the classification of highways from the First Class through the Fourth Class, as indicated in the Standards. Although this may introduce a small bias, the bias can be neglected from the standpoint of regional spatial distributions. The values for railways are taken from the specifications for the First Class single-route electrified railway.

Then, the shortest travel times from the centroid of each town (or node) to all other town nodes are estimated. Their average is defined as the transportation accessibility $A_i$ of a specific town node and is estimated by the following equation.

$$A_{i} = \mathop \sum \limits_{j = 1}^{n} T_{ij} /n$$
(2)

where $T_{ij}$ is the shortest travel time from node i to node j and n is the number of destination nodes for node i.
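The computation behind Eq. (2) can be sketched as follows. The text does not state which shortest-path algorithm is used, so the Floyd-Warshall algorithm here is an assumption, as are the function names, the four-node network, and its link travel times. The average is taken over all nodes other than node i itself.

```python
import math


def shortest_travel_times(n_nodes, links):
    """All-pairs shortest travel times (Floyd-Warshall) on an undirected network.

    links -- iterable of (node_a, node_b, minutes) tuples
    """
    T = [[0.0 if i == j else math.inf for j in range(n_nodes)]
         for i in range(n_nodes)]
    for a, b, minutes in links:
        T[a][b] = min(T[a][b], minutes)
        T[b][a] = min(T[b][a], minutes)
    for k in range(n_nodes):
        for i in range(n_nodes):
            for j in range(n_nodes):
                if T[i][k] + T[k][j] < T[i][j]:
                    T[i][j] = T[i][k] + T[k][j]
    return T


def mean_travel_time(T, i):
    """Eq. (2): average shortest travel time from node i to all other nodes."""
    others = [T[i][j] for j in range(len(T)) if j != i]
    return sum(others) / len(others)


# Hypothetical network with four town nodes; link times are in minutes.
example_links = [(0, 1, 10.0), (1, 2, 15.0), (2, 3, 20.0), (0, 2, 30.0)]
T = shortest_travel_times(4, example_links)
A = [mean_travel_time(T, i) for i in range(4)]
print(A)
```

In this example the direct 30 min link between nodes 0 and 2 is replaced by the 25 min route through node 1, which is the kind of substitution the shortest-path step is meant to capture.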

Transportation network density is estimated as the ratio of the right-of-way area of the transportation infrastructure to the total land area within which that infrastructure is located. The density value ranges from 0 to 1; the larger the value, the higher the transportation network density. Since $A_i$ is measured in minutes, it must be converted into a dimensionless quantity like $R_i$ by using the following equation. The converted $A_i$ is denoted as $\text{UA}_i$.

$${\text{UA}}_{i} = \frac{{A_{i} - {\text{Min}}T_{ij} }}{{{\text{Max}}T_{ij} - {\text{Min}}T_{ij} }}$$
(3)

where $\text{Min}T_{ij}$ = minimum shortest travel time from node i to node j; $\text{Max}T_{ij}$ = maximum shortest travel time from node i to node j.

The transportation network density ($R_i$) and the unified transportation accessibility ($\text{UA}_i$) are averaged to obtain an integrated transportation accessibility ($\text{TA}_i$), as estimated by the following equation.

$${\text{TA}}_{i} = \left( {R_{i} + {\text{ UA}}_{i} } \right)/2$$
(4)
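Equations (3) and (4) can be combined in a short Python sketch. It reads the minimum and maximum in Eq. (3) as the extremes over the destinations j of node i, which is an interpretation of the definitions above rather than a confirmed detail; the function names and input values are hypothetical.

```python
def unified_accessibility(travel_times):
    """Eq. (3): rescale the mean shortest travel time of one node to [0, 1].

    travel_times -- shortest travel times T_ij from node i to every other node
    """
    a_i = sum(travel_times) / len(travel_times)  # Eq. (2)
    t_min, t_max = min(travel_times), max(travel_times)
    if t_max == t_min:
        raise ValueError("travel times must not all be equal")
    # 0 when a_i equals the minimum travel time, 1 when it equals the maximum.
    return (a_i - t_min) / (t_max - t_min)


def integrated_accessibility(r_i, ua_i):
    """Eq. (4): average of network density and unified accessibility."""
    return (r_i + ua_i) / 2.0


# Hypothetical node: network density 0.0216, travel times of 10, 25, and 45 min.
ua = unified_accessibility([10.0, 25.0, 45.0])
ta = integrated_accessibility(0.0216, ua)
print(ua, ta)
```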

Population and economy accessibilities for each node are also defined and quantified. Both are defined with aggregate spatial interaction models based on Newton’s law of gravitation, in which the attraction between two nodes is directly proportional to the attraction size and inversely proportional to the travel resistance. The two concepts are quantified by the following two equations.

$${\text{PA}}_{i} = \sum\limits_{j = 1}^{n} {\frac{{P_{j} }}{{D_{ij}^{\alpha} }}}$$
(5)
$${\text{EA}}_{i} = \sum\limits_{j = 1}^{n} {\frac{{E_{j} }}{{D_{ij}^{\alpha} }}}$$
(6)

where $\text{PA}_i$ = population accessibility at node i; $\text{EA}_i$ = economy accessibility at node i; $P_j$ = population size at node j; $E_j$ = economic level at node j; $D_{ij}$ = shortest travel time from node i to node j; and $\alpha$ = resistance coefficient (default is 1).
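Equations (5) and (6) share one form, so a single function can serve both, as in the sketch below. Skipping the term j = i is an assumption made here because the travel time from a node to itself is zero; the function name and the three-node sample values are hypothetical.

```python
def gravity_accessibility(attraction, D, i, alpha=1.0):
    """Eqs. (5) and (6): gravity-type accessibility of node i.

    attraction -- population (for PA) or economic level (for EA) of each node
    D          -- matrix of shortest travel times between nodes
    alpha      -- resistance coefficient (1 by default, as in the text)
    """
    total = 0.0
    for j, size in enumerate(attraction):
        if j == i:
            continue  # assumed: the node itself is excluded, since D[i][i] is 0
        total += size / D[i][j] ** alpha
    return total


# Hypothetical three-node example.
population = [100.0, 50.0, 80.0]   # thousand persons
gdp = [40.0, 10.0, 25.0]           # billion yuan
D = [[0.0, 10.0, 20.0],
     [10.0, 0.0, 25.0],
     [20.0, 25.0, 0.0]]            # minutes

pa_0 = gravity_accessibility(population, D, 0)  # 50/10 + 80/20
ea_0 = gravity_accessibility(gdp, D, 0)         # 10/10 + 25/20
print(pa_0, ea_0)
```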

4 Models of Intercity Rail Transit Scope

4.1 General Rail Transit Scope Models Using Global Datasets

Table 1 shows the 16 models of rail transit length and the associated impact factors that were developed by regression with the 254 samples. The statistical analysis indicates that intercity rail transit size is most highly correlated with the regional GDP level, followed by population, while the correlation with land use is relatively lower. Models (a), (d), (g), (i), (l), and (n) are obtained with the “Enter” method, and the other models with the “Stepwise” method. Other factors may also influence rail transit size, such as policy, investment, technology, and natural conditions. To address those factors, rail transit length per capita versus population density (people per square kilometer) and rail transit length per capita versus per capita GDP are also analyzed.

Table 1 Regression models for rail transit length and social economic factors

4.2 Estimating Rail Transit Lengths for the PRD Region

The models shown in Table 1 can be used to estimate the size of intercity rail transit system that is compatible with the PRD region. Specifically, regression models (a), (b), (c), (q), and (r) are used to calculate the intercity rail transit sizes. A comparison of different model forms shows that the exponential form, expressed by models (q) and (r) (Table 1), fits the sample data best. The results suggest that intercity rail transit length per capita is longer where per capita GDP (GDP/P) is higher, and that intercity rail transit density is lower where population density is higher. This is possibly because intercity rail transit systems connect cities with high population densities, and the land between cities, which has larger open spaces, can be used at lower cost.

As shown in Table 2, the total intercity rail transit length projected by the models is about 2293 km, which is about 21% longer than the planned length of 1890 km (as of 2012); put the other way, the planned length is about 17.6% shorter than the projection. Compared with the 254 existing intercity rail transit systems in other countries, the intercity rail transit lengths in the PRD region are relatively short. Model (q) is the regression model based on the per capita economic factor, and the length estimated by model (q) is lower than the actually planned length. The estimate leaves room to accommodate extra demand that may be caused by other unknown factors and can thus be viewed as proactive forecasting in planning.

Table 2 Comparison between estimated and planned total rail transit lengths

The PRD case study focuses on nine cities: Guangzhou, Shenzhen, Zhuhai, Foshan, Jiangmen, Dongguan, Zhongshan, Huizhou, and Zhaoqing. A total of 373 towns (nodes, referring to the centroids of the towns concerned) and 259 streets (links) are included in the study area. The intercity rail transit systems are classified into four classes in the planning. Since every county is represented by just one local node in the network, models (a), (b), and (c) are inappropriate for checking the suitability of the planned rail transit size for each town. Models (q) and (r) are therefore used to estimate the intercity rail transit length for each class. The input data include the population density and per capita GDP of every node, from which the regional intercity rail transit size of every node is obtained with models (q) and (r). The average intercity rail transit length of a node ranges from 0.3 to 37.3 km. The node with the largest rail transit size is located in Nanshan, and the node with the smallest is located in Gaoyao. Clustering analysis yields four classes of transit size, as shown in Table 3.

Table 3 Intercity rail transit size based on society and economy development

Running models (q) and (r) gives the following classes. The first class, comprising Guangzhou Tianhe and Shenzhen Nanshan, has an average intercity rail transit length ranging from 35 to 38 km. The second class includes four nodes located in Guangzhou and Shenzhen, with average lengths ranging from 9.091 to 31.658 km. The third class, comprising 20 nodes in Guangzhou, Shenzhen, Zhuhai, Foshan, and Dongguan, has projected average lengths of intercity rail transit routes from 14.183 to 23.094 km. The fourth class includes 347 nodes with smaller average rail transit lengths.

4.3 Classifying Intercity Rail Transit Lengths at City Level

The estimated average intercity rail transit lengths are further broken down into five classes for each of the cities concerned, from shorter to longer lengths. The results are summarized in Table 4. They indicate that Guangzhou and Shenzhen need the largest intercity rail transit size, followed by Zhuhai, Foshan, and Dongguan. Huizhou, Zhongshan, and Jiangmen also need a large intercity rail transit size, although, based on the model results, their need is less than that of the above cities. To eliminate the effect of land use area, that is, to consider only the intercity rail transit length per unit area, the transit density is estimated for all towns and classified into five classes. The first class basically includes the central zone of almost every city in the PRD region, such as Guangzhou, Shenzhen, and Dongguan. Foshan, Nanmeng, and Guzhen, as well as parts of Guangzhou and Shenzhen, fall into the second class. The third class mainly covers parts of Foshan, Zhongshan, Dongguan, and Shenzhen.

Table 4 Estimated intercity rail transit size for each city

5 Checking Modeled Transit Scopes with Accessibilities

5.1 Spatial Distributions of Population and Economy Accessibilities

The PRD region contains several of the most densely populated cities in China. The regional population density is divided into ten classes, with the highest density at 38,458 persons per square kilometer and the lowest at 21 persons per square kilometer. To better understand the relationship between population density and transportation development, the population density distribution and the transport network are overlaid in the ArcGIS platform, as shown in Fig. 1. The current transport network distribution is generally in accordance with the population density distribution. The population accessibility distribution is shown in four classes in Fig. 2. Guangzhou, Dongguan, and Shenzhen have the highest population accessibility. Guangzhou, Foshan, Dongguan, and Shenzhen have the second-highest population accessibility. Jiangmen, Huizhou, Zhuhai, and Conghua have the lowest population accessibility. This shows that population accessibility in the PRD region is centered on Guangzhou, where the areas of highest population accessibility are located, and decreases with distance from the central areas.

Fig. 1 Overlapped population density and traffic network

Fig. 2 Population accessibility distribution

Similarly, the economy accessibility distribution is calculated by Eq. (6). Guangzhou remains at the center in terms of economy accessibility. Figure 3 shows the economy accessibility distribution in the PRD region. Guangzhou, Dongguan, and Shenzhen are the areas with the highest economy accessibility, while Jiangmen, Huizhou, Zhuhai, and Conghua are the areas with the lowest. Across the 373 sample nodes in the PRD region, population accessibility and economy accessibility are highly correlated, with a correlation coefficient close to 1.

Fig. 3 Economy accessibility distribution

5.2 Transportation Accessibility with Intercity Rail Transit Planning

The transportation infrastructure system provides a “backbone” that shapes transportation accessibility. Accessibility reflects the ability of a location to provide people with access to daily life and the associated economic activities by connecting land use and transportation facilities. To investigate the spatial distribution of transportation accessibility, we use the “natural cutting” method to visualize the defined index for each town in the PRD region. The 373 nodes in the study area are classified into three classes by the ratio of transportation land use to the land area that contains the transportation infrastructure concerned (the transportation land use ratio, $R_i$). The first class is defined as the area with a transportation land use ratio ranging from 3.06 to 7.93% and covers 29 nodes. The land surrounding Guangzhou Baiyun Songzhou Road has the highest transportation land use ratio (7.93%). In the second class, the transportation land use ratio ranges from 1.37 to 3.06%, and 134 nodes are covered, mainly distributed in Guangzhou, Foshan, Dongguan, and Shenzhen. The third class includes 210 nodes with a transportation land use ratio lower than 1.37%. Thus, the layout of transportation infrastructure development in the PRD region follows a “center + radiation” pattern. The central areas are mainly located in Guangzhou and Foshan, and the second center is in Shenzhen. These central areas are connected with peripheral areas through freeways, while less highway infrastructure has been developed in the outlying areas. A Guangzhou–Shenzhen–Hong Kong corridor has formed in the eastern part of the PRD region. The transportation land use ratio is much lower in the western and northern areas, which is a weakness in propelling development within the PRD region.
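Assuming that “natural cutting” refers to the natural breaks (Jenks) classification available in ArcGIS, the idea can be sketched in Python as a search for the class boundaries that minimize the sum of squared deviations within classes. The brute-force search below is only practical for small samples and is not the optimized procedure a GIS package would use; the function names and the sample ratios are hypothetical.

```python
from itertools import combinations


def _within_class_ssd(values):
    """Sum of squared deviations of the values from their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def natural_breaks(values, n_classes):
    """Split sorted values into n_classes contiguous groups with minimal
    total within-class sum of squared deviations (exhaustive search)."""
    data = sorted(values)
    n = len(data)
    if not 1 <= n_classes <= n:
        raise ValueError("n_classes must be between 1 and len(values)")
    best_cost, best_groups = None, None
    # Each combination of cut positions defines one candidate classification.
    for cuts in combinations(range(1, n), n_classes - 1):
        bounds = (0, *cuts, n)
        groups = [data[a:b] for a, b in zip(bounds, bounds[1:])]
        cost = sum(_within_class_ssd(g) for g in groups)
        if best_cost is None or cost < best_cost:
            best_cost, best_groups = cost, groups
    return best_groups


# Hypothetical transportation land use ratios (%) for eight nodes.
ratios = [2.2, 0.5, 7.0, 0.9, 2.4, 7.5, 0.7, 2.0]
classes = natural_breaks(ratios, 3)
print(classes)
```

The class limits reported in the text would correspond to the boundaries between adjacent groups returned by such a procedure.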

The shortest average travel times from each traffic analysis zone (i) to all other zones (j) are calculated, and the $A_i$ values are then estimated by using Eq. (2). Of the 373 nodes, the node located in Guangzhou Jinsha Street has the lowest $A_i$ value (49 min). The node with the worst accessibility is located in Huizhou Huidong town, with a shortest average travel time to other nodes of 158 min. The overall shortest average travel time for all 373 nodes is 76 min. 21% of the nodes have shortest average travel times of less than one hour, and most of them are located in Guangzhou. 80% of the nodes have a shortest average travel time of 90 min, and 3.2% of 120 min. Those nodes are mainly located within Huizhou and Jiangmen. With the $A_i$ values for all nodes, the transportation accessibility index is then calculated, and the result is illustrated in Fig. 4. The result indicates that there are 214 nodes where the accessibility $A_i$ is higher than the average level (76 min) in the PRD region, and they are located in 57.4% of the counties in the region.

Fig. 4 Transportation accessibility distribution

The integrated transportation accessibility ($\text{TA}_i$) is classified into six levels by using the “natural cutting” method in the ArcGIS environment (Fig. 4). Sixteen nodes at the first level have a transportation accessibility higher than 0.689; most are located in Guangzhou and Foshan, except for two nodes in Zhongshan and Shenzhen. At the second level, 63 nodes have a transportation accessibility ranging from 0.569 to 0.688, of which 26 are located in Guangzhou and 13 in Foshan. At the third level, there are 103 nodes, located in Guangzhou, Dongguan, and Shenzhen. At the fourth level, there are 116 nodes, in Jiangmen and Huizhou. A further 72 nodes are in Huizhou, Jiangmen, Zhaoqing, and Zhuhai. The transportation accessibility distribution is centered on Guangzhou and Foshan, with Dongguan and Shenzhen as secondary centers. Transportation accessibility decreases outward from these centers toward the outlying areas and is higher in the eastern coastal areas than in the inner western areas of the region.

With the planned intercity rail transit incorporated into the GIS system, the density of the transportation network ($R_i$) would not change much, while the shortest travel time of each town or street ($A_i$) would change greatly. As shown in Fig. 5, the intercity rail transit planning can lead to a 20–30% improvement of $A_i$ in 199 towns/streets, most of which are distributed in the peripheral areas of the PRD region. Zhuhai gets the greatest $A_i$ improvement, whereas Guangzhou and Foshan, which have the highest original transportation accessibility, get the least. Thus, the planned rail transit can bring more equal regional accessibility to the PRD region.
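The improvement of the mean shortest travel time can be expressed as a relative reduction, as in the sketch below. Measuring improvement this way is an assumption, since the text does not give the formula; the function name and the sample travel times are hypothetical.

```python
def travel_time_improvement(a_before, a_after):
    """Relative reduction (%) of a node's mean shortest travel time
    after the planned rail links are added to the network."""
    if a_before <= 0:
        raise ValueError("a_before must be positive")
    return 100.0 * (a_before - a_after) / a_before


# Hypothetical peripheral node: 120 min without rail, 90 min with rail.
improvement = travel_time_improvement(120.0, 90.0)
print(improvement)  # 25.0
```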

Fig. 5 Transportation accessibility change distribution

6 Conclusion and Discussion

Since there is no standard for determining rail transit size, the authors developed a new method that uses regression to correlate the sizes of 254 regional intercity rail transit systems in different countries with regional population, land use area, and economic (GDP) factors. The resulting scope models [i.e., models (a), (b), (c), (q), and (r)] are used as a reference benchmark for estimating the intercity rail transit size for the PRD region.

Based on the rail transit scope estimation, this research further provides an exemplary framework for accessibility-based analysis of intercity rail transit planning and for visualizing the results in a GIS environment. The framework adopts three accessibility measures: transportation, population, and economy accessibilities. Its application to the PRD suggests that all three accessibilities are centered on Guangzhou and decrease with distance from the central areas. However, with the planned rail transit, the shortest travel times of the peripheral areas of the PRD improve more than those of the central areas, such as Guangzhou and Foshan, so the rail transit brings more equal regional accessibility to the PRD region.

Based on the case study of the PRD region, both the models for estimating the scope of intercity rail transit systems and the framework for analyzing the accessibility impact of that scope can be applied to other metropolitan areas and even to other transport modes. However, the accessibility models have limitations, especially in that they consider only population and GDP as the social economic factors influencing location and travel demand. In future work, more factors, such as employment, industry, and poverty, will be considered in the accessibility-based analysis.