1 Introduction

In the current era of big data, public transit agencies are benefiting from the large amounts of archived data generated by automatic fare collection (AFC) systems, which can offer detailed and accurate information not easily derived from manual surveys, and accordingly enable the development of various practical applications [1,2,3]. Nevertheless, the potential of these detailed automated data is not fully leveraged in studies of passenger flow characteristics and station attributes from a network perspective. On the other hand, the reliability of the services of urban rail transit networks (URTNs) plays an essential role in relieving heavy urban traffic congestion and meeting the needs of passengers and managers [4,5,6,7]. As one of the rail service reliability indicators, time reliability (TR) is of particular concern to operators, as it affects the efficiency and service quality of transit networks [8, 9]. However, due to equipment malfunctions, driver behaviors, irregular loads, weather conditions, and a host of other factors [10], vehicles often suffer from delays in URTNs, causing poor TR with detrimental impacts on passenger satisfaction and system efficiency, especially during peak hours [11]. Hence, TR is a critical attribute for determining the efficiency and attractiveness of URTN stations. In addition, developing a reasonable ranking method to identify station importance in URTNs has received considerable attention [12, 13] and was characterized in previous research as a highly challenging issue [14, 15]. It can be regarded as a multiple-attribute decision-making (MADM) problem, requiring the combined treatment of different attribute characteristics of individual stations.

Based on the above discussion, the objective of this paper is to use AFC data for identifying station importance in URTNs. Although there are many centrality measures that can be applied to evaluate the importance of stations in URTNs, the different centrality measures have certain shortcomings and limitations in various aspects and do not incorporate the unique attributes of the URTNs, especially the TR. Therefore, we first collect and process a large amount of AFC data to show the overall spatial and temporal characteristics of passenger flows, as well as the imbalance in passenger flow directions at individual stations and lines of the URTN. Based on the traditional definition of TR, a new comprehensive metric of daily TR for stations is proposed and analyzed for correlation with other centrality measures. Then, the centrality measures incorporating TR as multi-attributes of rail transit stations are utilized in the weighted TOPSIS (Technique for Order Performance by Similarity to Ideal Solution) algorithm to identify the combined importance of stations in the URTN. To evaluate the performance of the method, we explore the dynamic robustness of the URTN under deliberate attacks to examine the significance of stations ranked by different centrality measures in maintaining network connectivity. Finally, we verify the feasibility and validity of the model by adopting Beijing's URTN as a case study and performing a sensitivity analysis of the weights of the measures.

2 Literature Review

2.1 TR of Transit Networks

Travel time is one of the most critical characteristics of trips, while travel TR is considered to be of equal or greater significance than travel time in commuters' travel decisions [16, 17]. Traditional reliability indicators are operator-oriented due to the lack of actual travel data on individual users [18]. In practice, commonly used indicators focus either on punctuality, i.e., the degree of adherence to the scheduled departure time, or on regularity, i.e., headway time variation [19]. From the passenger's perspective, the ideal situation requires the provision of door-to-door journey indicators, rather than station-based punctuality or regularity [10]. Some researchers have noted that the 95th percentile of travel time should be taken into account in addition to the average travel time, as passengers must budget extra time for variations in actual travel time [20]. Sun and Xu [2] investigated the travel TR of the Beijing Subway from the perspective of platform congestion in the morning peaks. Carrel et al. [21] used smartphone and vehicle location data from the Muni network in San Francisco to measure transit travel experiences. Gittens and Shalaby [22] evaluated 20 indicators and proposed a new journey time buffer index that uses bus stop wait time estimates to capture both waiting and travel time variation, with an application of the new indicator to the London Transit Commission's bus network. Engelson and Fosgerau [23] considered three types of measures of traveler costs for travel time variability used in travel demand modeling and transportation economics. Woodard et al. [24] collected global positioning system (GPS) data from mobile phones or other detection vehicles and presented a method to provide accurate travel TR predictions for a complete large road network. Fu and Gu [25] explored the impact of a new metro line on the flow distribution, travel time, and TR of different categories of metro commuters. Gu et al. [26] introduced the concept of network reliability and explored the characteristics and interrelationships among reliability, vulnerability, and resilience of transportation networks. Liu et al. [7] overcame the difficulty of obtaining a large amount of passenger data by applying a Monte Carlo simulation approach to estimate TR for its incorporation into accessibility metrics. Zang et al. [27] provided a comprehensive framework for summarizing the methodological developments of travel TR modeling in transportation networks, including its characterization, assessment, and valuation, from a network perspective. Grenville et al. [28] presented a series of methods and performance measures for using Wi-Fi connectivity data to measure various aspects of customer experience and reliability, including estimating wait times and measuring origin–destination travel time variation. Overall, while recent studies have begun to develop more practical applications of TR, the characteristics of the spatial and temporal distributions of rail transit travelers have not been sufficiently investigated. Additionally, limited by the great challenges in data collection and processing, most previous research has relied on numerical simulation, which inevitably creates some discrepancies with the actual passenger flow behavior. More importantly, given the widely varied purposes and needs of daily passenger trips on weekends and weekdays, indicators should be developed for daily passenger travel to better reflect the specific characteristics of the daily passenger flow of urban rail transit systems.

2.2 Methodology for Centrality Measures

A large number of centrality measures, including degree centrality (DC) [29, 30], betweenness centrality (BC) [29, 31], and eigenvector centrality [32, 33], have been explored and widely applied in various complex networks. Lü et al. [34] suggested leader rank to identify leaders in social networks, which performs well in directed networks. Gómez et al. [35] improved the betweenness centrality measure by defining a new closeness centrality measure that explicitly considers several dimensions. To [36] introduced five centrality measures and explained whether and how these measures can be applied to network analysis of urban rail systems. Yang et al. [37] proposed a centrality measure called DCC (degree clustering coefficient), which comprehensively considers degree and clustering coefficients as well as neighbors to identify influential nodes.

Recently, some scholars have begun to focus on the importance of nodes in community structure, which is an emerging research field in network science. In essence, Newman introduced “community-centrality” when discussing the modularity matrix, which is a measure of a node's contribution potential to the group structure [38]. Community centrality is a classical centrality measure, independent of any defined partition. In addition, vitality measures are commonly used to determine the actual node contributions [39]. Newman's modularity vitality, a specific cluster–quality vitality measure, is often used as an objective function for many popular community detection algorithms, making it a natural choice for measuring the quality of clusters [40, 41]. Following these research precedents, Magelinski et al. [42] further proposed a community-aware measure based on centrality and community detection theory that distinguishes between hubs and bridges depending on Newman's modularity vitality after the removal of nodes.

Specifically in public transit networks (PTNs), passenger flow strength centrality (SC) is commonly used to reflect the dynamic behavior of passenger travel. For example, Huang et al. [43] utilized the R-space to conduct a comparative empirical analysis of passenger flow-weighted PTN in Beijing. Zhang et al. [44] designed a double-layered PTN in which the upper layer is closely related to the passenger flow strength carried by the PTN. Su et al. [45] investigated the impact of disruptions in URTN on passenger flow distribution indicators such as inbound volume, outbound volume, and transfer volume, both in individual and overall dimensions. However, these studies have focused on only single centrality measures, with each measure possessing its unique drawbacks and limitations. Therefore, research into combining different characteristics of each node to determine node ranking should be explored.

2.3 Node Evaluation Techniques in Complex Networks

In recent years, effective, efficient, and accurate identification of node importance in complex networks has attracted widespread interest in the research community. Du et al. [12] considered the challenge as a MADM problem and applied TOPSIS for the first time to identify critical nodes in complex networks, using it to determine the importance of each node by aggregating different centrality measures. Hu et al. [15] proposed a weighted TOPSIS method to rank nodes efficiently by considering different centrality measures as multiple attributes of the network. Fan et al. [46] defined an index to prioritize node importance, known as cycle ratio, which contains a wealth of information in addition to well-known benchmark metrics. However, various real-world networks such as energy, social, and transportation networks have their own unique attributes in addition to traditional metrics. It is important to note that research on the identification of key stations in transportation networks, with consideration of characteristics specific to transportation networks, is still at an early stage of exploration. Zhang et al. [47] combined the MADM model with algorithm development to calculate the importance of China Railway Express nodes, including inland nodes and seaport nodes. Yang and Song [48] classified the metro stations for Ningbo city based on the entropy TOPSIS method, which considers residents' willingness as an important indicator for pre-planning assessment. Zhang and Ng [49] designed an entropy weight method—TOPSIS ranking approach to evaluate node criticality by considering various characteristics of nodes in the URTN. Nevertheless, specifically, URTNs have unique attributes such as SC and TR, which affect the core interests of all stakeholders in transportation systems. Therefore, the influence of TR should be taken into account when evaluating the importance of stations, for which there are still gaps in existing research.

2.4 Contributions

Overall, it remains a challenge to comprehensively evaluate the station importance of URTNs, especially considering their unique attributes, such as SC and TR, which deserve in-depth research. To address the shortcomings of existing studies, the main contributions of this study are summarized as follows:

(i) This study is the first to apply SC and TR, which are core factors that influence all stakeholders of transportation systems, to identify station importance in URTNs. Not limited to the individual centrality measures, the paper proposes the weighted TOPSIS method to aggregate the different centrality measures as multiple attributes in order to obtain an importance ranking for each station.

(ii) We have collected and processed a large amount of real passenger flow data to show the temporal and spatial distribution of passenger flows as a whole, as well as imbalances in the direction of passenger flows. These flow characteristics can be used to better provide transportation services. In addition, in view of the vastly different purposes and needs of daily passenger trips on weekends and weekdays, the design and development of the daily TR analysis indicator and analytical comparisons of individual centrality measures on each day can reveal the real travel experience of rail transit riders.

(iii) This paper provides a new research framework for comprehensively identifying critical stations in urban PTNs. This study can also contribute to guidance for the planning and operation of rail transit systems and inform the effective design of station protection strategies to meet various policy-oriented demands that focus on different measures.

3 Methodology

In this paper, based on AFC data, we incorporate TR and combine different centrality measures as multiple attributes in the weighted TOPSIS to identify crucial stations in URTNs. This section introduces the basic theory of the traditional centrality measures, TR, and weighted TOPSIS, followed by the identification algorithm.

3.1 Traditional Centrality Measures

A large number of traditional centrality measures have been used to identify station importance in URTNs, such as DC, BC, and SC.

3.1.1 DC

DC is a commonly used metric to describe the importance of a node in a complex network, and is defined as the number of other nodes connected to the node. In the context of URTNs, the higher the DC value, the higher the connectivity and importance of the stations, which are generally hub stations. The DC \(C_{D} ({\text{i}})\) of station i is calculated by the following formula:

$$C_{D} (i) = \sum\limits_{j = 1}^{N} {a_{ij} }$$
(1)

where \(N\) is the total number of stations; \(a_{ij}\) is the network adjacency matrix element—if station \(i\) is connected to station \(j\), \(a_{ij} = 1\); otherwise, \(a_{ij} = 0\).
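As an illustration of Eq. (1), the following minimal sketch computes the DC of each station as the row sum of the adjacency matrix. The four-station network is hypothetical and is not taken from Beijing's URTN.

```python
def degree_centrality(adjacency):
    """Eq. (1): C_D(i) is the sum of row i of a symmetric 0/1 adjacency matrix."""
    return [sum(row) for row in adjacency]


# Hypothetical network: stations 0-1-2 on one line, plus a branch from 1 to 3.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]

print(degree_centrality(adjacency))  # [1, 3, 1, 1]
```

Station 1, the only transfer-like node in this toy network, receives the highest DC.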

3.1.2 BC

BC indicates the significance of nodes from a global perspective. It is defined as the fraction of the shortest paths between all node pairs that pass through the node [29, 31]. In an URTN, BC \(C_{B} (i)\) of station \(i\) reflects its importance in terms of the number of OD pairs that need it along the shortest route in the transit network. As a result, the greater the BC, the greater the station's importance in the network, which can be expressed as

$$C_{B} (i) = \sum\limits_{s,t \ne i} {\frac{{n_{st} (i)}}{{n_{st} }}}$$
(2)

where \(n_{st}\) represents the number of shortest paths between station \(s\) and station \(t\); \(n_{st} (i)\) represents the number of shortest paths between station \(s\) and station \(t\) passing through station \(i\).
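One common way to evaluate Eq. (2) on an unweighted network is Brandes' algorithm, sketched below. This is an illustrative implementation rather than the computation used in this study; the function name, the neighbor-list input, and the optional normalization by the number of station pairs \((N-1)(N-2)/2\) are assumptions.

```python
from collections import deque


def betweenness_centrality(neighbors, normalized=True):
    """Brandes' algorithm for an unweighted, undirected graph.

    neighbors: dict mapping each station to an iterable of adjacent stations.
    Returns station -> sum over unordered pairs (s, t) of n_st(i) / n_st,
    optionally divided by (N - 1)(N - 2) / 2 (an assumed normalization).
    """
    bc = {v: 0.0 for v in neighbors}
    for s in neighbors:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in neighbors}
        dist = {v: -1 for v in neighbors}
        preds = {v: [] for v in neighbors}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in neighbors[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies, starting from the farthest stations.
        delta = {v: 0.0 for v in neighbors}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    n = len(neighbors)
    for v in bc:
        bc[v] /= 2.0  # each unordered pair was counted from both endpoints
        if normalized and n > 2:
            bc[v] /= (n - 1) * (n - 2) / 2.0
    return bc


# Hypothetical star-shaped network: every shortest path passes through "B".
star = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B"], "D": ["B"]}
bc = betweenness_centrality(star)
```

In this toy network the hub "B" lies on all shortest paths between the other stations, so its normalized BC is 1 and that of every other station is 0.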

3.1.3 SC

The SC of the station reflects the actual passenger carrying capacity of the station in an URTN. The higher the value of SC, the greater the ability of the station to produce/attract passengers. The SC is divided into inbound SC and outbound SC according to the different directions of passenger flow; thus the total SC \(C_{{\text{S}}}^{{{\text{total}}}} (i)\) of station \(i\) can be defined as

$$C_{{\text{S}}}^{{{\text{total}}}} (i) = C_{{\text{S}}}^{{{\text{inbound}}}} (i) + C_{{\text{S}}}^{{{\text{outbound}}}} (i)$$
(3)

where \(C_{{\text{S}}}^{{{\text{inbound}}}} (i)\) and \(C_{{\text{S}}}^{{{\text{outbound}}}} (i)\) denote the inbound and outbound passenger flows at station \(i\), respectively.
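A minimal sketch of Eq. (3) is given below, assuming that each AFC record has already been reduced to an (entry station, exit station) pair and that the inbound SC counts entries while the outbound SC counts exits. The record layout and station names are hypothetical.

```python
from collections import Counter


def strength_centrality(trips):
    """Return station -> (inbound SC, outbound SC, total SC), following Eq. (3).

    trips: iterable of (entry_station, exit_station) pairs, one per AFC record.
    """
    trips = list(trips)  # allow generators, since the records are scanned twice
    inbound = Counter(entry for entry, _ in trips)   # passengers entering
    outbound = Counter(exit_ for _, exit_ in trips)  # passengers exiting
    stations = set(inbound) | set(outbound)
    return {s: (inbound[s], outbound[s], inbound[s] + outbound[s]) for s in stations}


# Hypothetical morning-peak records: two suburban entries at "A", exits downtown.
records = [("A", "B"), ("A", "C"), ("B", "C")]
sc = strength_centrality(records)
print(sc["A"], sc["C"])  # (2, 0, 2) (0, 2, 2)
```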

3.2 TR

TR is often used to reflect the level of service of the transit network from the traveler's perspective, and it is a major determinant of individual activity and travel choice behavior. In addition, TR is an important transportation service performance indicator that measures the likelihood of passengers arriving at their destinations on time. Many studies have considered the 95th percentile travel time from one station to another as an indicator of travel time reliability, implying that passengers who budget this time arrive at the destination station on time with a probability of 95%. The reliability buffer time (RBT) metric is the amount of slack time added to a traveler's typical travel time to ensure on-time arrival at the destination. The higher the RBT, the less reliable the service. The RBT from station \(i\) to station \(j\), denoted as \({\text{RBT}}_{ij}\), is calculated as

$${\text{RBT}}_{ij} = {\text{TT}}_{ij}^{95\% } - {\text{TT}}_{ij}^{{{\text{aver}}}}$$
(4)

where \({\text{TT}}_{ij}^{95\% }\) and \({\text{TT}}_{ij}^{{{\text{aver}}}}\) are the 95th percentile travel time and the average travel time from station i to station j, respectively.

The buffer time index (BTI) indicates how much extra “buffer” time, relative to the average travel time, travelers need to budget to arrive on time in 19 out of 20 trips (i.e., 95% of the trips). It is the ratio of the difference between the 95th percentile travel time and the average travel time to the average travel time. The BTI is commonly used to evaluate TR and is adopted here as the TR measure, so a higher value indicates lower reliability:

$${\text{TR}}_{ij} = \frac{{{\text{TT}}_{ij}^{95\% } - {\text{TT}}_{ij}^{{{\text{aver}}}} }}{{{\text{TT}}_{ij}^{{{\text{aver}}}} }} \times 100\%$$
(5)

where \({\text{TR}}_{ij}\) is the TR from station i to station j.
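Equations (4) and (5) can be illustrated with a small numerical sketch. The percentile estimator (linear interpolation between order statistics) is an assumption, since the estimator is not specified here, and the travel times are hypothetical.

```python
import math


def percentile(values, q):
    """Percentile with linear interpolation (q in [0, 100]); assumed estimator."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def rbt_and_tr(travel_times):
    """Return (RBT_ij, TR_ij) for one OD pair, following Eqs. (4) and (5)."""
    mean_tt = sum(travel_times) / len(travel_times)
    tt95 = percentile(travel_times, 95)
    rbt = tt95 - mean_tt                 # Eq. (4), in the unit of the input
    return rbt, rbt / mean_tt * 100.0    # Eq. (5), in percent


# Hypothetical travel times (minutes) observed for one OD pair: 20, 21, ..., 40.
times = list(range(20, 41))
rbt, tr = rbt_and_tr(times)
```

For these samples the average travel time is 30 min and the 95th percentile is 39 min, so the RBT is 9 min and the TR is approximately 30%.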

In addition, the TR of station \(i\) in an URTN can be defined as the average TR from the station to the other stations, which can be denoted as

$${\text{TR}}_{i} = \frac{1}{N} \cdot \sum\limits_{j = 1, \, j \ne i}^{N} {{\text{TR}}_{ij} }$$
(6)

where \({\text{TR}}_{i}\) is the TR of station i. Furthermore, we can define the daily TR of station i as

$${\text{TR}}_{i}^{d} = \frac{1}{N} \cdot \sum\limits_{j = 1, \, j \ne i}^{N} {{\text{TR}}_{ij}^{d} }$$
(7)

where \({\text{TR}}_{i}^{d}\) is the daily TR of station \(i\); \({\text{TR}}_{ij}^{d}\) is the daily TR from station \(i\) to station \(j\); d denotes date.
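The aggregation in Eqs. (6) and (7) can be sketched as follows, assuming a complete matrix of OD-level TR values for a given day. The \(1/N\) factor is kept exactly as printed in the equations, although the sum contains \(N-1\) terms; the matrix values are hypothetical.

```python
def station_tr(tr_matrix):
    """Eqs. (6)/(7): TR_i = (1 / N) * sum over j != i of TR_ij.

    tr_matrix: N x N list of lists, tr_matrix[i][j] = TR from station i to j (%).
    Diagonal entries are ignored.
    """
    n = len(tr_matrix)
    return [
        sum(tr_matrix[i][j] for j in range(n) if j != i) / n
        for i in range(n)
    ]


# Hypothetical OD-level TR values (%) for three stations on one day.
tr_od = [
    [0.0, 6.0, 9.0],
    [3.0, 0.0, 12.0],
    [6.0, 6.0, 0.0],
]
print(station_tr(tr_od))  # [5.0, 5.0, 4.0]
```

With real AFC data, OD pairs without any observed trips would need separate handling, which this sketch does not attempt.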

3.3 Weighted TOPSIS

Among the various MADM approaches, TOPSIS is one of the most commonly used ranking measures, as proposed by Hwang and Yoon [50]. TOPSIS aims to rank the evaluation objects by measuring the distance between them and the positive and negative ideal solutions. The positive ideal solution minimizes the cost criteria and maximizes the benefit criteria at the same time. The negative ideal solution, on the other hand, is the opposite. Thus, the evaluation objects reach their best state when they are closest to the positive ideal solution and farthest from the negative ideal solution. The specific steps of the weighted TOPSIS algorithm are described in detail as follows:

Step 1 Let us consider a multi-attribute information decision matrix \(D = (x_{mn} )\), which consists of alternatives (rows) and criteria (columns):

$$D = \left[ {\begin{array}{*{20}c} {x_{11} } & \cdots & {x_{1j} } & \cdots & {x_{1n} } \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ {x_{i1} } & \cdots & {x_{ij} } & \cdots & {x_{in} } \\ \vdots & \ddots & \vdots & \ddots & \vdots \\ {x_{m1} } & \cdots & {x_{mj} } & \cdots & {x_{mn} } \\ \end{array} } \right].$$
(8)

Step 2 Normalize the decision matrix \(D\):

$$y_{ij} = \frac{{x_{ij} }}{{\sqrt {\sum\nolimits_{i = 1}^{m} {x_{ij}^{2} } } }},\;i = 1, \ldots ,m;\;j = 1, \ldots ,n.$$
(9)

Step 3 Confirm the positive ideal solution denoted as \(A^{ + }\) and the negative ideal solution denoted as \(A^{ - }\). They can be described as follows:

$$A^{ + } = \left\{ {y_{1}^{ + } ,y_{2}^{ + } , \ldots ,y_{n}^{ + } } \right\} = \left\{ {\left( {\mathop {\max }\limits_{i} y_{ij} |j \in K_{{\text{b}}} } \right),\left( {\mathop {\min }\limits_{i} y_{ij} |j \in K_{{\text{c}}} } \right)} \right\}$$
(10)
$$A^{ - } = \left\{ {y_{1}^{ - } ,y_{2}^{ - } , \ldots ,y_{n}^{ - } } \right\} = \left\{ {\left( {\mathop {\min }\limits_{i} y_{ij} |j \in K_{{\text{b}}} } \right),\left( {\mathop {\max }\limits_{i} y_{ij} |j \in K_{{\text{c}}} } \right)} \right\}$$
(11)

where \(K_{{\text{b}}}\) is the set of benefit criteria and \(K_{{\text{c}}}\) is the set of cost criteria.

Step 4 Obtain the separation measures of each alternative from the positive ideal and negative ideal solutions, denoted as \(S_{i}^{ + }\) and \(S_{i}^{ - }\), respectively. They are calculated as Euclidean distances in which each criterion is weighted by its associated weight:

$$S_{i}^{ + } = \sqrt {\sum\limits_{j = 1}^{n} {w_{j} * (y_{j}^{ + } - y_{ij} )^{2} } } ,\;i = 1, \ldots ,m;\;j = 1, \ldots ,n.$$
(12)
$$S_{i}^{ - } = \sqrt {\sum\limits_{j = 1}^{n} {w_{j} * (y_{j}^{ - } - y_{ij} )^{2} } } ,\;i = 1, \ldots ,m;\;j = 1, \ldots ,n.$$
(13)

where \(w_{j}\) is the weight of the \(j\)th criterion.

Step 5 Finally, calculate and normalize the relative closeness to the ideal solution:

$$C_{i} = \frac{{S_{i}^{ - } }}{{S_{i}^{ - } + S_{i}^{ + } }},\;i = 1, \ldots ,m.$$
(14)
$$\mathop {C_{i} }\limits^{\sim } = \frac{{C_{i} }}{{\sum\limits_{i = 1}^{m} {C_{i} } }},\;i = 1, \ldots ,m.$$
(15)

where \(C_{i}\) and \(\mathop {C_{i} }\limits^{\sim }\) are standard and normalized relative closeness, respectively. The alternatives are ranked according to their relative closeness to the ideal solution: an alternative with a higher value of \(\mathop {C_{i} }\limits^{\sim }\) is assumed to be more important.
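Steps 1 to 5 can be sketched in a few lines of code. The following is a minimal illustration of Eqs. (8) to (15), not the implementation used in this study; the function name, the example values, the equal weights, and the treatment of all four criteria as benefit criteria are assumptions made only for the example.

```python
import math


def weighted_topsis(matrix, weights, is_benefit):
    """Return the normalized relative closeness of each alternative (Eq. 15).

    matrix: m x n decision matrix (rows = alternatives, columns = criteria).
    weights: n criterion weights w_j.
    is_benefit: n booleans, True for benefit criteria and False for cost criteria.
    Assumes no all-zero column and at least two distinct alternatives.
    """
    m, n = len(matrix), len(matrix[0])
    # Step 2 (Eq. 9): vector normalization of each column.
    norms = [math.sqrt(sum(matrix[i][j] ** 2 for i in range(m))) for j in range(n)]
    y = [[matrix[i][j] / norms[j] for j in range(n)] for i in range(m)]
    # Step 3 (Eqs. 10 and 11): positive and negative ideal solutions.
    cols = [[y[i][j] for i in range(m)] for j in range(n)]
    a_pos = [max(c) if b else min(c) for c, b in zip(cols, is_benefit)]
    a_neg = [min(c) if b else max(c) for c, b in zip(cols, is_benefit)]
    # Steps 4 and 5 (Eqs. 12 to 14): weighted distances and relative closeness.
    closeness = []
    for row in y:
        s_pos = math.sqrt(sum(w * (p - v) ** 2 for w, p, v in zip(weights, a_pos, row)))
        s_neg = math.sqrt(sum(w * (q - v) ** 2 for w, q, v in zip(weights, a_neg, row)))
        closeness.append(s_neg / (s_neg + s_pos))
    # Eq. 15: normalize so that the closeness values sum to 1.
    total = sum(closeness)
    return [c / total for c in closeness]


# Hypothetical stations described by (DC, BC, SC, TR).
D = [
    [5, 0.30, 16000, 12.0],
    [2, 0.05, 1500, 4.0],
    [1, 0.00, 24, 2.0],
]
scores = weighted_topsis(D, weights=[0.25] * 4, is_benefit=[True] * 4)
ranking = sorted(range(len(D)), key=lambda i: scores[i], reverse=True)
print(ranking)  # [0, 1, 2]
```

The first station coincides with the positive ideal solution and the third with the negative ideal solution, so they receive the largest and smallest closeness, respectively.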

3.4 Identification Algorithm for Station Importance

In this paper, we apply a weighted TOPSIS approach to identifying the URTN’s crucial stations, considering DC, BC, SC, and TR as multiple attributes. The specific steps of the proposed method are as follows:

Step 1 Construct the L-space URTN. The L-space URTN is a topological mapping of the actual network, i.e., treating stations as nodes and line segments connecting two adjacent stations as links.

Step 2 Calculate the values of the different centrality measures to obtain the multi-attribute matrix. To capture the unique properties of the URTN, the SC and TR metrics are combined with the topological measures: the DC, BC, SC, and TR are calculated by Eqs. (1), (2), (3), and (6)/(7), respectively, and are treated as multiple attributes (i.e., the columns of matrix \(D\) represent the values of the different centrality measures). The attributes of these measures, including name, data type, and classification, are described in Table 1.

Table 1 The attributes of DC, BC, SC, and TR

The matrix \(D\) can be represented as follows:

$$D = \left[ {\begin{array}{*{20}c} {DC_{1} } & {BC_{1} } & {SC_{1} } & {TR_{1} } \\ {DC_{2} } & {BC_{2} } & {SC_{2} } & {TR_{2} } \\ \vdots & \vdots & \vdots & \vdots \\ {DC_{m} } & {BC_{m} } & {SC_{m} } & {TR_{m} } \\ \end{array} } \right].$$

Step 3 Normalize the multi-attribute matrix \(D\). The normalized matrix \(D\) is acquired by Eqs. (8) and (9).

Step 4 Compute the separations of the stations from the positive ideal solution \(A^{ + }\) and the negative ideal solution \(A^{ - }\) with weights assigned through Eqs. (12) and (13). We apply the Euclidean distance to determine the separation measures \(S_{i}^{ + }\) and \(S_{i}^{ - }\) from the positive ideal and the negative ideal solutions, and the specific process is shown in Fig. 1.

Fig. 1 The computational process of the separation measures \(S_{i}^{ + }\) and \(S_{i}^{ - }\)

Step 5 Calculate and normalize the relative closeness \(\mathop {C_{i} }\limits^{\sim }\) of each rail transit station to the ideal solution by Eqs. (14) and (15). The importance of a station is evaluated using the value \(\mathop {C_{i} }\limits^{\sim }\): stations with larger relative closeness are considered more important and should be given higher priority. A flow chart of the proposed method is shown in Fig. 2.

Fig. 2 Flow chart of the proposed method
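The L-space construction in Step 1 can be sketched as follows, assuming that each line is available as an ordered list of station names. The line and station names are hypothetical, and a loop line would need its first station repeated at the end of the list to close the ring.

```python
def build_l_space(lines):
    """Build an undirected L-space network from ordered station lists.

    lines: dict mapping a line name to its ordered list of station names.
    Returns station -> set of adjacent stations; consecutive stations on a
    line are linked, and a transfer station shared by lines is a single node.
    """
    neighbors = {}
    for stations in lines.values():
        for s in stations:
            neighbors.setdefault(s, set())
        for a, b in zip(stations, stations[1:]):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


# Two hypothetical lines crossing at transfer station "X".
network = build_l_space({
    "Line 1": ["A", "X", "B"],
    "Line 2": ["C", "X", "D"],
})
dc = {s: len(adj) for s, adj in network.items()}  # DC of each station, Eq. (1)
print(dc["X"], dc["A"])  # 4 1
```

The resulting neighbor sets supply the topology needed for the DC and BC columns of matrix \(D\); the SC and TR columns come from the AFC data.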

4 Case Study

4.1 Data

We take Beijing’s URTN, one of the busiest and most complex rail transit systems in the world, as a case study, and the proposed centrality measures are integrated to evaluate and visualize the station importance. As of March 2018, Beijing’s URTN covered 280 stations on 17 lines, as shown in Fig. 3 (where different colors indicate different lines). We obtain population density information in different districts of Beijing in 2018 to reflect the local transportation demand of passengers, as shown in Fig. 4. We can observe that the closer a district is to the center of Beijing, the higher its population density, i.e., Dongcheng and Xicheng districts have the highest density, followed by Chaoyang and Haidian districts, and finally the other districts have relatively low densities. Districts with higher population density generally have greater travel demand. Figure 5 shows the degree distribution and the degree distribution in double logarithmic coordinates of Beijing's URTN, where the fitted curve obeys the exponential distribution. This indicates that Beijing's URTN presents scale-free properties with strong heterogeneity, and its network exhibits complex network characteristics. Due to the significantly different spatial distribution characteristics of weekday and weekend passenger flows [51], we collect AFC data on Beijing’s URTN for 1 week during the morning peak hour (7:00–8:00) from March 3 to 9, 2018, with an average daily morning peak passenger flow of approximately 90,000, where March 3 and 4 fall on a weekend and the remaining days are weekdays. We visualize the total passenger trips during the morning peak hour over the week to observe the spatial pattern of passenger flows, as shown in Fig. 6. Each node in Fig. 6a represents a station, and the arc between each station pair represents trips between these two stations without considering the direction. 
The larger the station’s node and the darker the color, the higher the passenger flows. We can observe that stations such as GM, JTL, DWL, and QNL produce/attract a large number of passengers. Additionally, the passenger flow distribution of different routes during the daily morning peak period is counted, as shown in Fig. 6b. Although the overall passenger flows are higher on the 5th to 9th (weekdays), it is surprising that they show almost the same pattern as the 3rd to 4th (weekends), i.e., the spatial distribution of passenger trips on weekdays and weekends is quite similar. Apart from that, lines 1, 6, 10, and Batong are the busiest, attracting the most trips. Thus, due to the large volume and similarity of daily data, the analyses below are selected from the 3rd to the 6th, which include two weekdays and two weekend days. When there is a huge influx of passengers on the routes and stations during the peak hour, it is highly likely that trains will be congested and delayed, which will cause an increase in the travel time of passengers. Consequently, passengers need to set aside extra time to reach their destination stations on time. To effectively identify the crucial stations in Beijing’s URTN, TR is included in the evaluation metrics.

Fig. 3 Beijing’s URTN in March 2018

Fig. 4 Schematic distribution of population density in different districts of Beijing in 2018

Fig. 5 Degree distribution and the degree distribution in double logarithmic coordinates of Beijing’s URTN

Fig. 6 Spatial distributions of passenger flows during the morning peak hour

4.2 Evaluation of Centrality Measures

4.2.1 DC and BC

The DCs and BCs of Beijing’s URTN are calculated and visualized as shown in Fig. 7. As shown in Fig. 7a, the DCs of the stations range from 1 to 5. The stations with high DC values are all transfer stations of the routes, the highest of which is the XZM station, reaching 5. In addition, the average DC of the URTN is 2.296, and the DCs of the stations are mostly concentrated at 2, accounting for 75.7% of the stations. The results in Fig. 7b indicate that the BCs of the stations range from 0 to 0.3207, with an average BC of 0.062, of which the highest is for the CYP station. More obvious is the fact that most of the high BC values of stations are concentrated in downtown locations, which means that these stations fall on a large number of the shortest routes for passenger trips.

Fig. 7 DCs and BCs of the stations in Beijing’s URTN

4.2.2 SC

The evaluation performance of the SCs of Beijing’s URTN by applying the AFC data is shown in Fig. 8, where Fig. 8a–c are the inbound SCs, outbound SCs, and total SCs of the stations, respectively. The inbound SCs of the stations vary greatly, ranging from 17 to 7056, with an average inbound SC of 714 across the URTN. The maximum value is associated with the CY station. Similarly, the outbound SCs of the stations are quite variable, ranging from 2 to 13445 with an average value of 714. The GM station has the largest value among all stations. Nevertheless, the difference is that stations with large values of inbound SCs are mainly located at the edge of the network, while stations with large values of outbound SCs are basically at the center of the network, which indicates that Beijing's URTN carries a large amount of commuter passenger flows from the suburbs to downtown during the morning peak period. Moreover, the total SCs of the stations are essentially a combination of inbound SCs and outbound SCs, and the stations with larger values are concentrated in the center and edges of the network, ranging from 24 to 16007, with an average value of 1428, where the maximum value of the stations is the GM station. It is particularly interesting to note that for either the inbound SCs, the outbound SCs, or the total SCs, the stations on the eastern section of lines 1, 6, and Batong attract substantially higher passenger flows. This observation is attributed to the high population density in the areas of Tongzhou and Dingfuzhuang (located between Line 6 and Line Batong), the high volume of commuters traveling into the downtown area during the morning peak, and the proximity to Beijing's business centers, such as the Guomao CBD (central business district, near GM station), which is one of the three national CBDs in China.

Fig. 8 Inbound, outbound, and total SCs of the stations in Beijing’s URTN

4.2.3 TR

Considering that the spatial distribution and directional characteristics of rail transit passenger flow vary from day to day due to the different purposes of residents' trips, especially between weekends and weekdays, we calculate and analyze the daily TRs of Beijing’s URTN by utilizing the AFC data, as shown in Fig. 9. Figure 9a–e presents the TRs of the stations on March 3, 4, 5, and 6 and over the four days in total, where the 3rd and 4th are weekend days and the 5th and 6th are weekdays. In general, the daily TRs of the stations differ greatly, with the largest difference occurring on March 4 at 20.633% and the smallest on March 3 at 11.917%, indicating that some stations have high reliability while others have very erratic travel times. Figure 9e shows that 10.4% of stations have TRs above 10%, which means that passengers need to budget an extra buffer of at least 10% of their average travel time to reach their destination station on time when traveling from these stations during the morning peak hour. A deeper examination reveals that stations with high TR values are typically stations with high passenger flows, such as the GM and DWL stations, or stations located far from other stations, such as the T2 and T3 stations. This pattern is attributed, on the one hand, to the lack of data caused by the low passenger flows at some stations, which results in seemingly better reliability, and, on the other hand, to the delays incurred during vehicle operation over the long route distances between stations. Furthermore, the results can not only evaluate the TRs of the stations but also guide passengers to reserve extra time for reliable travel and help the government and enterprises to make rational plans for train schedules on Beijing's URTN.

Fig. 9 TRs of the stations in Beijing’s URTN

4.3 Correlation of Indicators

Commonly used feature selection methods include the filter method, wrapper method, and embedded method. Considering that the filter method is easier to calculate and does not depend on specific models, we choose the Pearson correlation coefficient (one of the multivariate filter methods) to explore the interrelationships among the centrality measures. Specifically, we compute the Pearson correlation coefficients among the daily TRs of the stations and among the centrality measures, in order to compare their ability to identify crucial stations, as shown in Fig. 10a, b. Given two indicators \(X\) and \(Y\), if \(r(X,Y)\) is close to 1, \(X\) and \(Y\) are highly positively correlated and differ little from each other; if \(r(X,Y)\) is close to −1, \(X\) and \(Y\) are highly negatively correlated, i.e., they change in opposite directions; if \(r(X,Y)\) is close to 0, \(X\) and \(Y\) are only weakly linearly correlated and differ considerably from each other. In Fig. 10a, the TRs of March 3 and 4 have a correlation of 0.655, those of March 5 and 6 have a higher correlation of 0.7795, and the four-day total is strongly correlated with each individual day, with all coefficients greater than 0.7. Regarding the centrality measures, DC and BC are highly similar, while TR and SC are highly correlated, reaching 0.741. This reaffirms the conclusion that TR is related to the passenger flows of the stations: the higher the passenger volumes, the more likely the station is to be unreliable. However, the correlations among the other pairs of measures are not pronounced, which suggests that different measures identify significant stations from different perspectives and each offers its own insight. Since we are more concerned with a comprehensive evaluation of station importance, it is necessary to integrate these measures through the weighted TOPSIS method so that the result carries more complete information.
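As an illustration of the correlation analysis described above, the following minimal sketch computes pairwise Pearson coefficients between station indicators using only the Python standard library. The function names `pearson` and `correlation_matrix` and the five-station values are hypothetical and chosen for illustration; they are not the data behind Fig. 10.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient r(X, Y) of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and variances up to the common factor 1/n, which cancels in r.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(indicators):
    """Return a nested dict with r for every ordered pair of indicators."""
    names = list(indicators)
    return {a: {b: pearson(indicators[a], indicators[b]) for b in names}
            for a in names}


# Hypothetical values for five stations (illustrative only, not the paper's data).
indicators = {
    "DC": [2, 4, 3, 2, 6],
    "BC": [0.01, 0.12, 0.05, 0.02, 0.20],
    "SC": [300, 5200, 900, 150, 2500],
    "TR": [0.03, 0.14, 0.05, 0.02, 0.09],
}
matrix = correlation_matrix(indicators)
for name, row in matrix.items():
    print(name, {other: round(r, 3) for other, r in row.items()})
```

The matrix is symmetric with ones on the diagonal, so in practice only the upper triangle needs to be reported, as in a correlation heat map.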

Fig. 10 Correlation coefficients of daily TRs and centrality measures

4.4 Aggregating Indicators by Weighted TOPSIS

We aggregate the indicators DC, BC, SC, and TR using the weighted TOPSIS method and present the resulting station evaluation for Beijing’s URTN in Fig. 11. We assign the same weight to all indicators, i.e., we consider them equally important; the weights can be changed to reflect the focus of different city policies. The weighted TOPSIS scores of the stations vary significantly, ranging from 0.0001 to 0.0161 with a network-wide average of 0.0036, and the maximum value belongs to the DWL station. Table 2 lists the top-10 ranked stations in Beijing’s URTN under the different centrality measures and under the proposed weighted TOPSIS method. Counting the overlaps, the top-10 list of the weighted TOPSIS method shares two stations with DC, two with BC, nine with SC, and six with TR. This indicates that the station rankings generated by the weighted TOPSIS method carry identification information from the different indicators and have good discrimination power. The method can therefore provide a more integrated, comprehensive, and reasonable evaluation of station importance, taking into account the influence of all these centrality measures and avoiding the shortcomings and limitations of any single one. The results identify station importance more intuitively and effectively, and provide decision support for the protection and setup of stations.
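The aggregation step can be sketched as follows. This is a minimal sketch of the textbook weighted TOPSIS procedure (vector normalization, weighting, distances to the ideal and anti-ideal solutions, relative closeness), not necessarily the exact formulation used in this paper, whose own definition should take precedence. It assumes that all four indicators are benefit-type attributes, i.e., a larger value means a more important station. The function names and the four-station decision matrix are hypothetical.

```python
from math import sqrt


def weighted_topsis(matrix, weights):
    """Closeness scores in [0, 1]; rows are stations, columns are indicators.

    Assumption: every indicator is benefit-type (larger means more important).
    """
    n_cols = len(weights)
    if any(len(row) != n_cols for row in matrix):
        raise ValueError("every row needs one value per weight")
    # Step 1: vector-normalize each column (a zero column is left unscaled).
    norms = [sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_cols)]
    # Step 2: apply the indicator weights.
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_cols)]
                for row in matrix]
    # Step 3: ideal (best) and anti-ideal (worst) values per column.
    best = [max(col) for col in zip(*weighted)]
    worst = [min(col) for col in zip(*weighted)]
    # Step 4: Euclidean distances and relative closeness to the ideal solution.
    scores = []
    for row in weighted:
        d_best = sqrt(sum((v - b) ** 2 for v, b in zip(row, best)))
        d_worst = sqrt(sum((v - w) ** 2 for v, w in zip(row, worst)))
        total = d_best + d_worst
        scores.append(d_worst / total if total > 0 else 0.0)
    return scores


def top_k(names, scores, k):
    """Names of the k highest-scoring stations, best first."""
    order = sorted(range(len(names)), key=lambda i: scores[i], reverse=True)
    return [names[i] for i in order[:k]]


# Hypothetical decision matrix; columns are DC, BC, SC, TR (not the paper's data).
names = ["A", "B", "C", "D"]
matrix = [
    [6, 0.20, 5000, 0.14],
    [4, 0.10, 900, 0.05],
    [3, 0.05, 2500, 0.09],
    [2, 0.01, 150, 0.02],
]
weights = [0.25, 0.25, 0.25, 0.25]  # equal weights, as in the baseline setting
scores = weighted_topsis(matrix, weights)
ranking = top_k(names, scores, len(names))
print(ranking)
```

In this toy matrix, station A is best on every indicator and station D is worst on every indicator, so they receive closeness scores of 1 and 0, respectively. The overlap between two top-10 lists, as reported in Table 2, can then be counted with a set intersection such as `len(set(list_a) & set(list_b))`.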

Fig. 11 Weighted TOPSIS scores of the stations in Beijing’s URTN

Table 2 The top-10 ranked stations by DC, BC, SC, TR, and weighted TOPSIS in Beijing’s URTN

4.5 Comparison of Deliberate Attack Effectiveness

To assess the importance of stations in maintaining network connectivity, we study the dynamic robustness of Beijing's URTN under different metrics. Network connectivity \(E\) is commonly used as an effective metric for evaluating failures of transportation systems: stations are removed and the efficiency of the connections among the remaining stations is observed, i.e., the metric characterizes the degree of network collapse [52, 53]. The metric is defined as

$$E = \frac{1}{N(N - 1)}\sum\limits_{i \ne j} {\frac{1}{{d_{ij} }}}$$
(16)

where \(d_{ij}\) denotes the shortest path length from station \(i\) to station \(j\), and \(E \in \left[ {0,1} \right]\). When stations \(i\) and \(j\) are not connected, \(d_{ij} = \infty\) and the pair contributes nothing to the sum; when no stations are connected to each other, the network connectivity \(E = 0\). We rank the stations of Beijing’s URTN according to each station importance index, remove the 10% of stations with the largest index values at each time step \(t\), and compute the connectivity of the remaining network until the remaining network is empty. The results of the attack are shown in Fig. 12. Under every indicator, the network connectivity first decreases approximately linearly as the number of failed nodes increases, and the decline then gradually slows until the connectivity converges to zero. In the early stage of the attack, the network is still large. When stations fail, especially those at hub locations, the shortest paths between many pairs of stations change, which reduces the accessibility of the network. In the late stage of the attack, most stations have failed and the network has shrunk into smaller connected subgraphs; further removals have a relatively small influence on these local clusters, and the remaining connectivity is contributed mainly by the stations that are still interconnected, so the rate of decline flattens. This result indicates that the URTN exhibits a certain vulnerability in the face of deliberate attacks. Meanwhile, DC and BC generate the fastest decline in the early period, i.e., \(t < 2\), while the network collapses rapidly under the weighted TOPSIS method afterwards, at \(t \ge 2\), even surpassing DC and BC. Because different measures rank the stations differently, the remaining network connectivity declines at different rates when their respective top-ranked stations are removed.

We can observe that the DC, BC, and weighted TOPSIS curves decline much faster than the SC and TR curves. A faster rate of decline indicates that the removed stations are more likely to be critical hubs or transfer stations of the network, i.e., that the measure has a stronger ability to recognize important stations. DC and BC identify stations from the perspective of the topological characteristics of the URTN structure and yield favorable evaluation results. For example, the XZM station, ranked first by DC, is a three-line transfer station of Beijing Subway Lines 2, 4, and 13, and it is also an essential transportation hub and commercial center in Beijing. Consequently, in line with our expectations, the proposed weighted TOPSIS method appears to be a better indicator for identifying station importance in maintaining network connectivity.
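The attack procedure built on Eq. (16) can be sketched as follows, using only the Python standard library. The sketch rests on several assumptions that the text does not state explicitly: the network is treated as unweighted and undirected, so \(d_{ij}\) is the number of hops found by breadth-first search; \(N\) is kept fixed at the original number of stations, so that \(E\) tends to zero as stations are removed; and the ranking is computed once before the attack and is not updated between time steps. The function names and the five-station star network are hypothetical.

```python
from collections import deque


def efficiency(adj, n_total=None):
    """E = 1 / (N (N - 1)) * sum over i != j of 1 / d_ij for an unweighted graph.

    Unreachable pairs have d_ij = infinity and therefore contribute 0.
    n_total fixes N (assumption: the original network size is used).
    """
    n = len(adj) if n_total is None else n_total
    if n < 2:
        return 0.0
    total = 0.0
    for source in adj:
        # Breadth-first search gives hop distances from this source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for node, d in dist.items() if node != source)
    return total / (n * (n - 1))


def remove_nodes(adj, removed):
    """Copy of the adjacency dict without the removed stations and their links."""
    removed = set(removed)
    return {u: [v for v in nbrs if v not in removed]
            for u, nbrs in adj.items() if u not in removed}


def attack_curve(adj, ranking, fraction=0.1):
    """Connectivity after each step of removing the next top-ranked batch."""
    n = len(adj)
    batch = max(1, round(fraction * n))
    curve = [efficiency(adj, n)]
    current = adj
    for start in range(0, n, batch):
        current = remove_nodes(current, ranking[start:start + batch])
        curve.append(efficiency(current, n))
    return curve


# Hypothetical star network: hub H linked to four stations (illustrative only).
adj = {"H": ["A", "B", "C", "D"], "A": ["H"], "B": ["H"], "C": ["H"], "D": ["H"]}
# Rank by degree (DC), highest first; any other importance index could be used.
ranking = sorted(adj, key=lambda u: len(adj[u]), reverse=True)
curve = attack_curve(adj, ranking, fraction=0.2)
print(curve)
```

In this toy network, removing the hub in the first step disconnects every remaining pair, so the connectivity drops from 0.7 to 0 immediately. Replacing `ranking` with the order given by another index (BC, SC, TR, or weighted TOPSIS scores) produces the other curves of such a comparison.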

Fig. 12 The deliberate attack effectiveness of different indicators on the connectivity of Beijing’s URTN

4.6 Sensitivity Analysis

To illustrate the performance of the proposed method more convincingly, we conduct a sensitivity analysis by integrating the measures under different weights. TR, the most distinctive attribute of URTNs, is used as the basis: its weight is increased in one analysis and decreased in the other. Analysis 1 (increasing the weight of TR): the weights of DC, BC, SC, and TR are 0.2, 0.2, 0.2, and 0.4, respectively. Analysis 2 (decreasing the weight of TR): the weights of DC, BC, SC, and TR are 0.3, 0.3, 0.3, and 0.1, respectively. The weighted TOPSIS scores of the stations in Beijing’s URTN under the two sets of weights are shown in Fig. 13. In both cases the scores differ considerably among stations, with a range of 0.0001–0.0149 for analysis 1 and 0.0000–0.0175 for analysis 2. Comparing the two results, most stations keep the same color, but some change as a result of the different weights, which leads to a different identification of their importance. To further reveal how varying the weights affects the relationship with the centrality measures, we compare the Pearson correlation coefficients of these measures, as shown in Fig. 14. The weighted TOPSIS scores are correlated with the basic centrality measures and therefore carry their feature information. Furthermore, the findings are consistent with our expectations. Firstly, the correlation coefficients among the weighted TOPSIS (equal weights), analysis 1, and analysis 2 are quite high, indicating that the method is stable when incorporating these measures, i.e., its results do not change significantly under moderate variations of the weights.

Secondly, the correlation coefficient between analysis 1 and TR (0.8649) is higher than that between analysis 2 and TR (0.7282), while the correlation coefficient between the baseline weighted TOPSIS and TR (0.8039) lies in between. This demonstrates that the critical stations identified in analysis 1 carry more information about TR than those in analysis 2, which follows directly from the higher weight of TR in analysis 1. That is, the results of analysis 1 take more account of the influence of TR, whereas analysis 2 places more emphasis on the other centrality measures. Therefore, the proposed method can serve as a reference for evaluating stations in different cities under different emphases, to suit policy demands that focus on different measures.
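The sensitivity analysis above can be reproduced in outline as follows. This is a minimal, self-contained sketch under the same assumptions as a textbook TOPSIS (vector normalization, benefit-type indicators), which may differ from the exact formulation used in this paper. The three weight vectors are taken from the text; the six-station decision matrix and all function names are hypothetical, so the printed coefficients are not comparable with the values in Fig. 14.

```python
from math import sqrt


def topsis(matrix, weights):
    """Textbook TOPSIS closeness scores (assumes benefit-type indicators)."""
    cols = range(len(weights))
    norms = [sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in cols]
    v = [[weights[j] * row[j] / norms[j] for j in cols] for row in matrix]
    best = [max(c) for c in zip(*v)]
    worst = [min(c) for c in zip(*v)]
    scores = []
    for row in v:
        d_best = sqrt(sum((a - b) ** 2 for a, b in zip(row, best)))
        d_worst = sqrt(sum((a - w) ** 2 for a, w in zip(row, worst)))
        total = d_best + d_worst
        scores.append(d_worst / total if total > 0 else 0.0)
    return scores


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)


# Hypothetical decision matrix; columns are DC, BC, SC, TR (not the paper's data).
matrix = [
    [6, 0.20, 2500, 0.05],
    [4, 0.12, 5200, 0.14],
    [3, 0.05, 900, 0.09],
    [2, 0.02, 150, 0.02],
    [5, 0.15, 1200, 0.04],
    [2, 0.01, 3000, 0.12],
]
# Weight vectors for DC, BC, SC, TR as given in the text.
weight_sets = {
    "equal": [0.25, 0.25, 0.25, 0.25],
    "analysis 1": [0.2, 0.2, 0.2, 0.4],
    "analysis 2": [0.3, 0.3, 0.3, 0.1],
}
tr = [row[3] for row in matrix]
scores = {name: topsis(matrix, w) for name, w in weight_sets.items()}
# Correlation of each result with TR, and of each variant with the baseline.
corr_with_tr = {name: pearson(s, tr) for name, s in scores.items()}
corr_with_equal = {name: pearson(s, scores["equal"]) for name, s in scores.items()}
print(corr_with_tr)
print(corr_with_equal)
```

High values in `corr_with_equal` would indicate that the ranking is stable under the weight changes, while the differences in `corr_with_tr` show how strongly each weighting reflects TR.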

Fig. 13 Weighted TOPSIS scores of the stations in Beijing’s URTN with different weights

Fig. 14 Correlation coefficients of the results of weighted TOPSIS with different weights and centrality measures

5 Conclusions

This work is the first to incorporate TR into the metrics for identifying station importance in URTNs. In particular, a daily TR indicator for stations is proposed, the overall characteristics of passenger trips are revealed through the AFC data, and the distribution properties of different centrality measures of stations are explored. Furthermore, based on the weighted TOPSIS approach, we treat the centrality measures together with TR as multiple attributes of the network and evaluate the importance of stations in URTNs. To evaluate performance, we compare how the top-ranked stations identified by different methods contribute to maintaining network connectivity under deliberate attacks. Finally, the validity and reliability of the model are verified by taking Beijing’s URTN in the morning peak hour as a case study and conducting a sensitivity analysis.

The empirical results show that the designed method is typically a better indicator for identifying crucial stations in URTNs, which improves the rationality of the evaluation. This work also indicates that the URTN is vulnerable to deliberate attacks on significant stations, especially stations with large values of DC, BC, and weighted TOPSIS. Therefore, strengthening the resilience and robustness of critical stations against attacks is crucial for improving the stability of URTNs. Furthermore, the weighted TOPSIS method used in this study can be adapted to transportation policies that focus on different measures. For example, the weights of the centrality measures can be varied according to the importance attached to them by policymakers, so that the evaluation is more in line with the actual transport situation. The results of this study can also provide guidance and recommendations for station protection strategies, such as enhancing the connections to other modes of public transit (e.g., buses) to reduce the SC value of stations, and improving the schedule quality and infrastructure configurations to reduce the TR value of stations. Government agencies should focus on maintaining and protecting the secure operation of critical stations, as their failure may lead to rapid network collapse. Moreover, it is necessary to formulate emergency response strategies that reflect the importance of the individual measures at each station, and to develop more reasonable emergency preparedness organizations for each station, so as to comprehensively improve the handling capacity of the emergency system of the entire metro. Meanwhile, the paper can provide suggestions for the design and operation of URTNs. When new stations and lines are introduced, they should preferably connect to lower-ranked stations rather than to the key stations in the network, such as the DWL, GM, and JTL stations. In addition, stations with high TR values should be managed and monitored in real time, with timely and effective notification to passengers and an appropriate increase in train frequency, to minimize passenger congestion and improve service quality and operational efficiency.

However, several issues still need to be studied in more detail in the future, such as train performance characteristics, the capacity limitations of terminal and yard facilities, and ridership demand. In addition, we plan to extend the model to other transit network modes and other cities, such as bus transit and road networks.