1 Introduction

Communications-based train control (CBTC) systems use wireless technology to provide continuous, real-time, two-way communication between train and ground. They ensure the safety of train operation to the greatest extent and have become the preferred signal system in current urban rail transit [1]. The evolution of train-to-train (T2T) communication technology, which is a form of vehicle-to-vehicle (V2V) communication, is closely related to the wide application of CBTC systems in rail transit. In a CBTC system based on T2T communication, most of the functions implemented by ground equipment are transferred to the onboard train subsystem [2], which reduces the complexity of the system. Thus the CBTC system based on T2T communication is an important development direction for the next generation of train control.

Vehicle-to-everything (V2X) technology is a communication mode for the exchange of information between a car and the outside world, which mainly includes communication between the car and other cars, pedestrians, the road infrastructure, and the cloud. Cellular Vehicle-to-Everything (C-V2X) is a communication technology based on the 3rd Generation Partnership Project (3GPP) global unified standard [3], including Long Term Evolution V2X (LTE-V2X), 5th Generation-V2X (5G-V2X) and subsequent iterations. Although the communication environment and performance requirements of road traffic and rail traffic are quite different, the communication protocols are similar. At present, the dedicated frequency band allocated to the Long Term Evolution for Metro (LTE-M) system of urban rail is 1785–1805 MHz [4]. When train-to-wayside communication and train-to-train communication coexist in the system, the same time-frequency resources can be multiplexed between the two links. How to make full use of radio resources to meet the quality of service (QoS) requirements of different services and improve the overall performance of the wireless communication system is a problem that must be solved.

Wang et al. [5] introduced the current status of C-V2X technology and the progress of communication architecture standards. Emara et al. [6] evaluated the end-to-end delay of C-V2X communication based on multi-access edge computing, with the road vehicle scene as the research background. Chen et al. [7] proposed a C-V2X framework based on network slicing and spatial multiplexing, and allocated resources on demand for different types of users. Knoll et al. [8] put forward a C-V2X resource deployment architecture based on mobile network escort. None of these studies was aimed at urban rail transit application scenarios. At present, C-V2X vehicle-to-vehicle communication is based on device-to-device (D2D) technology.

The research on radio resource management of D2D technology is mainly concentrated on the public network, including D2D communication mode selection and D2D communication interference coordination. In terms of mode selection, Azam et al. [9] proposed a communication mode selection mechanism to maximize the system throughput, using a linear approximation algorithm to solve the mixed integer optimization problem to obtain the suboptimal solution. Tang et al. [10] established a joint optimization problem of sub-channel allocation and transmit power in different D2D communication modes, and proposed a distributed two-stage method to separate mode selection and resource allocation, with low computational complexity. Hossain et al. [11] introduced a mode selection scheme based on game theory. The optimization goal is to reduce the transmit power of D2D users and improve energy efficiency. In terms of interference coordination, Oduola et al. [12] considered D2D users multiplexing cellular users’ downlink resources for communication, established a power control optimization problem, and proposed centralized and distributed D2D power allocation mechanisms to ensure the QoS indicators of D2D users. Lee et al. [13] designed a distributed power allocation mechanism based on random geometric Poisson distribution, with the goal of maximizing D2D user throughput, but this mechanism does not take into account the interference of cellular users and the system performance of the cellular network. A D2D communication cellular network spectrum allocation algorithm based on game theory was proposed by Song et al. [14]. This algorithm can effectively avoid the interference between cellular and D2D users and make full use of resources.

To sum up, most of the existing research on radio resource management of D2D technology does not consider the complexity of the scene and the difference of user requirements. Different from pursuing fairness between user equipments (UEs) proposed in Wang's work [15], we propose a radio resource management algorithm based on the Stackelberg game model for the specific application scenarios of urban rail transit, taking into account the priority of different train transmission services. The train-to-train communication responsible for the transmission of CBTC services is taken as the leader, and the train-to-wayside communication responsible for the transmission of Passenger Information System (PIS), CCTV, and other services is taken as the follower.

The rest of this paper is organized as follows. The feasibility of implementing train-to-train communication is analyzed in Sect. 2. Section 3 proposes a radio resource management algorithm based on the Stackelberg game model, which effectively improves the reliability of train CBTC service under the coexistence of train-to-wayside and train-to-train communication modes. Simulation results and discussions are given in Sect. 4. Finally, we conclude the study in Sect. 5.

2 Feasibility Analysis of Train-to-Train Communication

In this section the feasibility of the train-to-train communication mode is analyzed from two aspects: the wireless signal coverage of the train as the transmitter and the train interval in the actual operation scene of the urban rail transit train.

2.1 Wireless Signal Coverage

In the CBTC system, the train antenna is located at the front or rear of the train and interacts with ground equipment to realize two-way communication between the train and the ground. The LTE-M system equipment technical specification [16] states that the requirements for train-to-wayside communication are met when the received signal-to-interference-plus-noise ratio (SINR) is greater than or equal to −2 dB. Using this specification as a benchmark, we simulate how the received SINR changes with the train-to-train communication distance, based on the actual parameters of the LTE-M wireless communication system and taking into account large-scale and small-scale fading models. The result is shown in Fig. 1.

Fig. 1 The coverage of train-to-train communication

As can be seen from Fig. 1, as the train-to-train communication distance increases, the quality of the wireless channel at the receiver continuously decreases. Taking the receiver SINR threshold of −2 dB from reference [17], the coverage range of the onboard antenna at its transmission power is about 700 m.
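As a rough illustration of how a coverage range follows from an SINR threshold, the sketch below inverts a simple log-distance path-loss model. Only the −2 dB threshold comes from the text. The transmit power, noise-plus-interference floor, reference loss, and path-loss exponent are hypothetical placeholders, and small-scale fading is ignored, so the printed range is not expected to reproduce Fig. 1.

```python
import math

SINR_THRESHOLD_DB = -2.0  # threshold quoted in the text


def sinr_db(distance_m, tx_dbm=23.0, floor_dbm=-95.0,
            ref_loss_db=40.0, exponent=3.0):
    """SINR in dB under a log-distance path-loss model.

    All default values are hypothetical; the reference distance is 1 m.
    """
    path_loss_db = ref_loss_db + 10.0 * exponent * math.log10(distance_m)
    return tx_dbm - path_loss_db - floor_dbm


def coverage_m(threshold_db=SINR_THRESHOLD_DB, tx_dbm=23.0, floor_dbm=-95.0,
               ref_loss_db=40.0, exponent=3.0):
    """Largest distance at which sinr_db() still meets the threshold."""
    margin_db = tx_dbm - floor_dbm - ref_loss_db - threshold_db
    return 10.0 ** (margin_db / (10.0 * exponent))


if __name__ == "__main__":
    d = coverage_m()
    print(f"coverage: {d:.0f} m, SINR at that distance: {sinr_db(d):.2f} dB")
```

Replacing the placeholder parameters with the actual LTE-M simulation parameters, and adding a fading margin, would be needed before comparing the result with the simulated curve.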

2.2 Train Tracking Interval

In the rail transit system, with the development of precise train positioning and mobile communication technology, moving block control has gradually replaced fixed block control and quasi-moving block control. The target braking point of the tracking train is set at the rear of the train ahead so as to control train operation dynamically. Based on the train tracking distance in the absolute distance braking mode, this section takes the speed of the front train as 0, that is, takes the tail of the front train as the end point of the movement authority, and demonstrates the feasibility of train-to-train communication between adjacent trains.

The train operation scene in the absolute distance braking mode is shown in Fig. 2. \(L_b\) is the service braking distance corresponding to the current running speed of the tracking train; \(L_s\) is the safety protection distance between the tracking train and the train ahead; \(L_{\min }\) is the minimum tracking distance of two trains in this mode.

Fig. 2 Absolute braking control mode

According to the curve of most commonly used braking characteristics and emergency braking characteristics of trains given in [18], the relation between the value of deceleration corresponding to different speeds and the minimum tracking distance is shown in Table 1. With the increase of train speed, the minimum tracking distance will increase gradually.

Table 1 Braking characteristic parameters of common trains

It can be seen from Table 1 that at the commonly used train running speed of 80 km/h, the minimum tracking distance is 478 m, which is less than the coverage range of train-to-train communication (about 700 m). Therefore, train-to-train communication is feasible in the urban rail transit environment.
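The feasibility argument reduces to comparing the minimum tracking distance with the communication range. The sketch below makes that comparison with the two figures quoted in the text (478 m and about 700 m). The helper `braking_distance_m` uses the constant-deceleration formula v²/(2a) purely as an illustration; it is an assumption, and any deceleration value passed to it is hypothetical rather than taken from Table 1.

```python
COVERAGE_M = 700.0             # approximate T2T coverage quoted in the text
MIN_TRACKING_80_KMH_M = 478.0  # minimum tracking distance at 80 km/h


def braking_distance_m(speed_kmh, deceleration_ms2):
    """Stopping distance under constant deceleration (illustrative assumption)."""
    v = speed_kmh / 3.6  # km/h -> m/s
    return v * v / (2.0 * deceleration_ms2)


def t2t_feasible(min_tracking_m, coverage_m=COVERAGE_M):
    """T2T is feasible when adjacent trains can be within radio range."""
    return min_tracking_m < coverage_m


if __name__ == "__main__":
    print("feasible at 80 km/h:", t2t_feasible(MIN_TRACKING_80_KMH_M))
    # Hypothetical deceleration of 1 m/s^2, for illustration only.
    print("braking distance:", round(braking_distance_m(80.0, 1.0), 1), "m")
```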

3 Radio Resource Management Model and Algorithm

The optimization goal of radio resource management is given by taking into account the characteristics of train-to-train communication service in urban rail transit. The coexistence scenario of train-to-train communication and train-to-wayside communication is described. Based on this, the rationality of applying a Stackelberg game to deal with the radio resource management problem is demonstrated.

3.1 System Model

3.1.1 Optimization Objectives and Transmission Mode

Compared with the CBTC service, passenger information system (PIS) and image monitoring system (IMS) services are less critical in urban rail transit systems, and train-to-train communication should mainly carry train control related services, namely CBTC services. Therefore, the optimization index of train-to-train communication radio resource management should be based on the QoS requirements of CBTC services, that is, higher requirements for reliability but lower requirements for data rate. After the introduction of train-to-train communication, the urban rail transit wireless communication system becomes more complicated, which places higher requirements on radio resource management as well. Improving the quality of the CBTC service carried by train-to-train communication, based on service priority, is the optimization goal of radio resource management in this paper.

D2D communication is divided into centralized control and distributed control. Compared with distributed control, centralized control not only takes advantage of D2D communication but also facilitates the management and control of resources. Therefore, in this scenario, the train-to-train communication mode adopts centralized control. When train-to-train communication and train-to-wayside communication coexist and the same time-frequency resources are multiplexed, the two communication links interfere with each other. Resource multiplexing can be divided into two situations: multiplexing the uplink resources or the downlink resources of train-to-wayside communication [19]. Due to the limitation of onboard terminal power, the LTE system is an uplink-limited system, so this paper takes the LTE uplink as the research object. In Fig. 3, the D2D link (Sidelink) between train 2 and train 3 multiplexes the uplink Uu channel resources of train 1. Train 2 is the transmitter of the D2D link, and train 3 is the receiver. The interference in the system has two parts: the base station on the Uu link suffers interference from the Sidelink transmission of train 2, and the Sidelink receiver suffers interference from the uplink Uu transmission of train 1 in train-to-wayside communication. The next step is to study the multiplexing of the D2D Sidelink and to establish a resource scheduling model with Fig. 3 as the application scenario.

Fig. 3 Diagram of the D2D link reusing the uplink channel of a train outside the D2D link

3.1.2 Optimization of Problem Transformation

Unlike the public network, which pursues fairness in resource allocation, urban rail transit has a small number of users, and different train services have different priorities. Train-to-train communication carries the CBTC service with the highest priority, while train-to-wayside communication carries PIS and CCTV services with lower reliability requirements. Therefore, in the Stackelberg game, the train-to-train communication link is the leader, so as to encourage the establishment of the D2D link and improve the reliability of the CBTC service. Since the D2D link and the train-to-wayside communication link multiplex wireless resources, train-to-wayside communication causes interference to the train-to-train communication link, which may affect the transmission rate of the CBTC service. Therefore, train-to-wayside communication is defined as the follower. On the basis of ensuring that the CBTC service is not affected, the throughput of video services such as PIS is increased as much as possible to maximize the link revenue.

According to the transmission mode described above, train-to-train communication multiplexes the uplink channel of the train-to-wayside communication link. As in the downlink, the spectrum resources used for the uplink in the LTE-M cell are divided into N mutually orthogonal channels. The difference is that the LTE-M uplink adopts single-carrier frequency-division multiple access (SC-FDMA), so the physical uplink shared channel (PUSCH) allocates physical resource blocks (PRBs) to users contiguously in order to preserve the single-carrier property.

In this section, the PRB is used as the unit of resource scheduling. In line with the actual scenario, and in order to avoid complex interference caused by channel multiplexing, it is assumed that the D2D link multiplexes the PRB of a single train-to-wayside communication link. Each PRB is occupied by at most one train-to-wayside communication link and one D2D link at a time. It is also assumed that the links suffer no interference other than that caused by channel multiplexing and Gaussian white noise.

Based on the above analysis, it can be concluded that the wireless signal quality at the receiver of the D2D link d can be expressed as Eq. 1. We first define a binary variable \(x_{d,n}\). If the D2D link d occupies \(PRB_{n}\), then the value of \(x_{d,n}\) is 1; otherwise, the value of \(x_{d,n}\) is 0 [15].

$$\begin{aligned} SINR_{d}^{n}=\frac{p_{d}^{n}\left| h_{d}^{n}\right| ^{2}}{x_{d,n}p_{c}^{n}\left| f_{c,d}^{n}\right| ^{2}+N_{0}} \end{aligned}$$
(1)

In Eq. 1, \(p_{d}^{n}\) represents the transmit power on \(PRB_{n}\) at the transmitter of the D2D link, \(h_{d}^{n}\) is the channel gain of the D2D link, \(p_{c}^{n}\) represents the uplink transmit power of the train on the train-to-wayside link on \(PRB_{n}\), \(f_{c,d}^{n}\) represents the interference channel from the train-to-wayside communication train to the train at the receiver of the D2D link, and \(N_{0}\) represents the Gaussian white noise power. Similarly, the wireless signal quality at the base station of the train-to-wayside communication link on \(PRB_{n}\) can be expressed as Eq. 2.

$$\begin{aligned} SINR_{e}^{n}=\frac{p_{c}^{n}\left| h_{c}^{n}\right| ^{2}}{x_{d,n}p_{d}^{n}\left| f_{d,c}^{n}\right| ^{2}+N_{0}} \end{aligned}$$
(2)

where \(h_{c}^{n}\) represents the channel gain of the link between the train-to-wayside communication train and the base station, and \(f_{d,c}^{n}\) represents the interference channel from the train at the transmitter of the D2D link to the base station. According to the previous analysis, when the quality of the D2D link wireless channel is fixed, the impact of train-to-train communication on the reliability of the train's CBTC service depends on the transmission power of the D2D link transmitter. However, the transmitter of the D2D link interferes with the traditional train-to-wayside communication link, thereby affecting the throughput of the train-to-wayside communication user service. Therefore, in this paper, the problem of resource allocation under the coexistence of train-to-train communication and train-to-wayside communication is transformed into a study of the throughput of both parties.
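Equations 1 and 2 can be evaluated directly once the powers and channel coefficients are known. The following sketch does so on a linear scale for a single PRB. The function and argument names are illustrative, and any numerical values passed in are placeholders rather than the parameters of Table 2.

```python
import math


def sinr_d2d(p_d, h_d, p_c, f_cd, n0, x_dn=1):
    """Eq. 1: SINR at the D2D receiver on one PRB (linear scale)."""
    return p_d * abs(h_d) ** 2 / (x_dn * p_c * abs(f_cd) ** 2 + n0)


def sinr_bs(p_c, h_c, p_d, f_dc, n0, x_dn=1):
    """Eq. 2: SINR at the base station on the same PRB (linear scale)."""
    return p_c * abs(h_c) ** 2 / (x_dn * p_d * abs(f_dc) ** 2 + n0)


def to_db(ratio):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(ratio)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    shared = sinr_bs(p_c=0.2, h_c=1e-3, p_d=0.2, f_dc=1e-4, n0=1e-10)
    alone = sinr_bs(p_c=0.2, h_c=1e-3, p_d=0.2, f_dc=1e-4, n0=1e-10, x_dn=0)
    print(f"base station SINR: {to_db(alone):.1f} dB alone, "
          f"{to_db(shared):.1f} dB with PRB reuse")
```

Setting `x_dn=0` removes the interference term, which makes the cost of PRB reuse on each link easy to see.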

3.2 Radio Resource Management Algorithm Based on Stackelberg Game

When the D2D link and the train-to-wayside communication link multiplex radio resources, spectrum utilization and CBTC service reliability improve, but the train-to-wayside communication link inevitably suffers interference. In order to reduce the influence of interference on the two communication modes, the Stackelberg game is introduced to coordinate resources [20]. Train-to-train communication carries the highest-priority CBTC service and takes priority in resource occupation. Therefore, this paper defines the D2D link as the leader and the cellular user (CU) link as the follower. The CU link owns the uplink channel resources. When the D2D link shares these resources, the CU link charges it a price; the D2D link weighs the benefit of sharing against this cost and determines its transmission power to obtain revenue.

3.2.1 Utility Function of Leader D2D Link

For the D2D link, the train in the CU link causes interference to the train \({\mathrm {d}}^{{'}}\) at the receiver of the D2D link during uplink transmission. The train runs at high speed in the tunnel, and the channel quality of vehicle-to-vehicle communication is related to speed. The faster the speed, the larger the Doppler spread, and the more drastic the channel changes. In this model, the leader’s utility function is determined by service priority, train speed, CU link interference, and base station, which is defined as throughput plus CBTC service priority weight and gain from CU link, and minus the loss caused by CU link interference to receiver, as shown in Eq. 3.

$$\begin{aligned} u_{d}\left( \alpha ,p_{d}\right)&= \log _{2} \left( 1+\frac{p_{d}g_{d{d^{'}}}}{p_{c}g_{u,{d^{'}}}+N_{0}}\right) +\alpha p_{d}g_{d,c}\\&\quad +\frac{\beta }{v}-\gamma p_{c}g_{u,{d^{'}}} \end{aligned}$$
(3)

where \(g_{d{d^{'}}}=|{h_{d}}|^{2}\) and \(g_{u,{d^{'}}}=|f_{u,d^{'}}^{n}|^{2}\) represent the channel gain of the D2D link and the channel gain between the CU train and the train \(d^{'}\) at the receiver of the D2D link, respectively. \(p_{c}\) (also written \(p_{u}\)) is the transmit power of the CU link train; \(p_{d}\) is the transmit power of train d at the transmitter of the D2D link; \(\alpha\) is the price (\(\alpha >0\)) charged to the D2D link for reusing the uplink channel of the CU link. \(\beta\) represents the priority of the CBTC service, \(v\) represents the train speed, and \(\gamma\) represents the priority of the video service.

The optimization of the leader's utility function can be summarized as adjusting \(p_{d}\) so as to maximize \(u_{d}\left( \alpha ,p_{d}\right)\), as shown in formula 4. In addition, since the train transmission power \(p_{d}\) is limited by practical factors such as the power limit and antenna gain, there is a constraint \({p_{\min }}\le p_{d}\le p_{\max }\), shown in formula 5.

$$\begin{aligned}&\max _{p_{d}} u_{d}\left( \alpha ,p_{d}\right) \end{aligned}$$
(4)
$$\begin{aligned}&{\hbox{s.t.}\; p_{\min }}\le p_{d}\le p_{\max } \end{aligned}$$
(5)

3.2.2 Utility Function of the Follower CU Link

Since the D2D link multiplexes the uplink channel resources of the CU link, the train d at the transmitter of the D2D link causes interference to the base station c. Similar to the leader’s utility function, the follower’s utility function is defined as Eq. 6:

$$\begin{aligned} u_{c}\left( \alpha ,p_{d}\right)&= \log _{2} \left( 1+\frac{p_{u}g_{c}}{p_{d}g_{d,c}+N_{0}}\right) -\alpha p_{d}g_{d,c}\\&\quad +\frac{1}{\beta v}+\gamma p_{u}g_{c} \end{aligned}$$
(6)

where \(g_{c}\triangleq |{h_{c}}|^{2}\) and \(g_{{\mathrm {d}},c}\triangleq |{f_{d,c}}|^{2}\) respectively represent the channel gain of the link between the CU train and the base station, and the channel gain between the train at the transmitter of the D2D link and the base station. The optimization of the follower's utility function can be summarized as adjusting \(\alpha\) to maximize \(u_{c}\left( \alpha ,p_{d}\right)\), as shown in formula 7.

$$\begin{aligned} \max _{\alpha } u_{c}\left( \alpha ,p_{d}\right) \end{aligned}$$
(7)
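For readers who want to experiment with the two utility functions, the sketch below evaluates Eqs. 3 and 6 exactly as printed. The single argument `p_c` stands for the CU train transmit power, which appears as both \(p_{c}\) and \(p_{u}\) in the equations. The function names are illustrative and all numerical inputs are placeholders.

```python
import math


def u_d(alpha, p_d, p_c, g_dd, g_ud, g_dc, n0, beta, v, gamma):
    """D2D link utility, Eq. 3 as printed."""
    rate = math.log2(1.0 + p_d * g_dd / (p_c * g_ud + n0))
    return rate + alpha * p_d * g_dc + beta / v - gamma * p_c * g_ud


def u_c(alpha, p_d, p_c, g_c, g_dc, n0, beta, v, gamma):
    """CU link utility, Eq. 6 as printed."""
    rate = math.log2(1.0 + p_c * g_c / (p_d * g_dc + n0))
    return rate - alpha * p_d * g_dc + 1.0 / (beta * v) + gamma * p_c * g_c


if __name__ == "__main__":
    # Placeholder channel and priority values for illustration only.
    common = dict(p_d=1.0, p_c=1.0, g_dc=0.5, n0=0.1, beta=2.0, v=20.0, gamma=0.5)
    print("u_d:", u_d(2.0, g_dd=1.0, g_ud=0.2, **common))
    print("u_c:", u_c(2.0, g_c=1.0, **common))
```

In both functions the price enters only through the term \(\alpha p_{d}g_{d,c}\), with opposite signs, so it acts as a transfer between the two links.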

3.2.3 Optimize the Utility Function of the Leader D2D Link

In the Stackelberg game, the D2D link chooses its optimal transmit power for a given price \(\alpha\). For fixed \(\alpha\), the utility function \(u_{d}\left( \alpha ,p_{d}\right)\) is a differentiable function of the single variable \(p_{d}\). Taking the derivative of \(u_{d}\left( \alpha ,p_{d}\right)\) with respect to \(p_{d}\), we get:

$$\begin{aligned} \frac{\partial u_{d}\left( \alpha ,p_{d}\right) }{\partial p_{d}}=\frac{1}{\ln 2} \frac{g_{d{d^{'}}}}{p_{c}g_{u,{d^{'}}}+N_{0}+p_{d}g_{d{d^{'}}}}+\alpha g_{d,c} \end{aligned}$$
(8)

Setting the above derivative equal to 0 gives the only stationary point:

$$\begin{aligned} p_{d}^{*}=-\frac{1}{\alpha g_{d,c}\ln 2}- \frac{p_{c}g_{u,{d^{'}}}+N_{0}}{g_{d{d^{'}}}} \end{aligned}$$
(9)

The second derivative of \(u_{d}\left( \alpha ,p_{d}\right)\) with respect to \(p_{d}\) is always less than 0:

$$\begin{aligned} \frac{\partial ^{2}u_{d}\left( \alpha ,p_{d}\right) }{\partial p_{d}^{2}}=-\frac{1}{ \ln 2} \left( \frac{g_{d{d^{'}}}}{p_{c}g_{u,{d^{'}}}+N_{0}+p_{d}g_{d{d^{'}}}}\right) ^{2}<0 \end{aligned}$$
(10)

The function \(f\left( p_{d}\right) =u_{d}\left( \alpha ,p_{d}\right)\) is continuous on the closed interval \(\left[ p_{\min },p_{\max }\right]\), so by the extreme value theorem it attains a maximum there. When \(p_{d}^{*}\in \left[ p_{\min },p_{\max }\right]\), the first derivative at \(p_{d}^{*}\) is equal to 0 and the second derivative is less than 0, so \(p_{d}^{*}\) is the only extreme point in the interval \(\left[ p_{\min },p_{\max }\right]\) and is the maximum point, with maximum value \(f\left( p_{d}^{*}\right)\). When \(p_{d}^{*}\notin \left[ p_{\min },p_{\max }\right]\), the function \(f\left( p_{d}\right)\) is monotonic on \(\left[ p_{\min },p_{\max }\right]\), so the maximum is attained at an endpoint of the interval, \(p_{d}=p_{\min }\) or \(p_{d}=p_{\max }\), and the maximum value is \(f\left( p_{\min }\right) \text { or }f\left( p_{\max }\right)\). In summary, the maximum point of the function \(f\left( p_{d}\right)\) belongs to \(\left\{ p_{\min },p_{\max },p_{d}^{*}\right\}\), that is, the leader's optimal strategy is \(p_{d}\in \left\{ p_{\min },p_{\max },p_{d}^{*}\right\}\).

3.2.4 Optimizing Utility Function of the Follower CU Link

From formula 9, whether \(p_{d}^{*}\) belongs to the interval \(\left[ p_{\min },p_{\max }\right]\) depends on the price \(\alpha\). When \(\alpha\) is too large, \(p_{d}^{*}\le p_{\min }\), and the optimal strategy of the D2D link is \(p_{d}=p_{\min }\). It can be seen from the utility function in formula 3 that if \(p_{d}\) is too small, the income will decrease. When \(\alpha\) is too small, \(p_{d}^{*}\ge p_{\max }\), and the optimal strategy of the D2D link is \(p_{d}=p_{\max }\), which increases the interference to the CU link and reduces its profit. Therefore, the price should be set so that \(p_{d}^{*}\in \left[ p_{\min },p_{\max }\right]\). According to formula 9, \(p_{d}^{*}\) and \(\alpha\) are negatively correlated, and \(\alpha\) must satisfy \(\alpha _{\min }\le \alpha \le \alpha _{\max }\), where:

$$\begin{aligned} \alpha _{\min }&=\frac{g_{d{d^{'}}}}{\left( g_{d{d^{'}}}p_{\max }+p_{c}g_{u,{d^{'}}} +N_{0}\right) g_{d,c}\ln 2} \end{aligned}$$
(11)
$$\begin{aligned} \alpha _{\max }&=\frac{g_{d{d^{'}}}}{\left( g_{d{d^{'}}}p_{\min } +p_{c}g_{u,{d^{'}}}+N_{0}\right) g_{d,c}\ln 2} \end{aligned}$$
(12)
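The case analysis above amounts to clamping the stationary power to the feasible interval, with the price bounds of Eqs. 11 and 12 marking where the clamp becomes active. The sketch below is a minimal illustration under one explicit assumption: the stationary point is written for a positive price, \(p_{d}^{*}=\frac{1}{\alpha g_{d,c}\ln 2}-\frac{p_{c}g_{u,d^{'}}+N_{0}}{g_{dd^{'}}}\), which is the sign convention under which \(\alpha _{\min }\) and \(\alpha _{\max }\) map exactly to \(p_{\max }\) and \(p_{\min }\). Function names and the numbers in the usage lines are hypothetical.

```python
import math

LN2 = math.log(2.0)


def stationary_power(alpha, p_c, g_dd, g_ud, g_dc, n0):
    """Interior stationary power for a positive price (assumed sign convention)."""
    return 1.0 / (alpha * g_dc * LN2) - (p_c * g_ud + n0) / g_dd


def price_bounds(p_min, p_max, p_c, g_dd, g_ud, g_dc, n0):
    """Return (alpha_min, alpha_max) as given by Eqs. 11 and 12."""
    def bound(p):
        return g_dd / ((g_dd * p + p_c * g_ud + n0) * g_dc * LN2)
    return bound(p_max), bound(p_min)


def best_power(alpha, p_min, p_max, **channel):
    """Clamp the stationary point to [p_min, p_max]."""
    return min(max(stationary_power(alpha, **channel), p_min), p_max)


if __name__ == "__main__":
    ch = dict(p_c=0.2, g_dd=0.5, g_ud=0.1, g_dc=0.2, n0=0.01)  # placeholders
    a_lo, a_hi = price_bounds(0.05, 0.2, **ch)
    for a in (a_lo / 10, (a_lo + a_hi) / 2, a_hi * 10):
        print(f"alpha={a:8.2f} -> p_d={best_power(a, 0.05, 0.2, **ch):.4f}")
```

Prices below \(\alpha _{\min }\) or above \(\alpha _{\max }\) leave the power pinned at \(p_{\max }\) or \(p_{\min }\), which is why only prices inside the interval need to be searched for an interior equilibrium.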

At this time, substituting the leader’s best strategy formula 9 into the follower’s utility function 6, we can get:

$$\begin{aligned} u_{c}\left( \alpha ,p_{d}\right)&=\log _{2} \left( 1 +\frac{p_{c}g_{c}}{N_{0}-\frac{1}{\alpha \ln 2} -\frac{\left( p_{c}g_{u,{d^{'}}}+N_{0}\right) g_{d,c}}{g_{d{d^{'}}}}}\right) \\&\quad +\frac{1}{\ln 2}+\alpha g_{d,c}\frac{p_{c}g_{u,{d^{'}}} +N_{0}}{g_{d{d^{'}}}}+\frac{1}{\beta v}+\gamma p_{u}g_{c} \end{aligned}$$
(13)

Observing the structure of the above formula, let \(A=\frac{1}{\ln 2}\), \({\mathrm {B}}=p_{c}g_{c}\), \({\mathrm {C}}=\frac{(p_{c}g_{u,{d^{'}}}+N_{0})g_{d,c}}{g_{d{d^{'}}}}\) and \({\mathrm {D}}=\frac{1}{\beta v}+\gamma p_{u}g_{c}\), and simplify the above formula for subsequent analysis. At this time, the follower’s utility function can be rewritten as:

$$\begin{aligned} u_{c}\left( \alpha ,p_{d}\right) =\log _{2} \left( 1+\frac{B\alpha }{N_{0}\alpha -A-C\alpha }\right) +A+\alpha C+D \end{aligned}$$
(14)

Differentiating with respect to the price \(\alpha\) gives:

$$\begin{aligned}&\frac{\partial u_{c}\left( \alpha ,p_{d}\right) }{\partial \alpha }\nonumber \\&\quad =C- \frac{BA^{2}}{\left[ \left( N_{0}\alpha -A-C\alpha \right) \left( N_{0}\alpha -A-C\alpha +B\alpha \right) \right] } \end{aligned}$$
(15)

Setting the derivative equal to 0 and distinguishing cases according to the coefficient of the quadratic term gives the following equations in \(\alpha\):

$$\begin{aligned} {\left\{ \begin{array}{l} \left[ -{\mathrm {C}}\left( N_{0}-{\mathrm {C}}\right) \left( {\mathrm {B}} -{\mathrm {C}}+N_{0}\right) \right] \alpha ^{2}+\left[ AC\left( 2N_{0}+B-2C\right) \right] \alpha \\ +A^{2}\left( B-C\right) =0,{\mathrm {B}}-{\mathrm {C}}+N_{0}\ne 0\\ \left[ AC\left( 2N_{0}+B-2C\right) \right] \alpha +A^{2}\left( B-C\right) =0,{\mathrm {B}}-{\mathrm {C}}+N_{0}=0\\ ABC\alpha +A^{2}\left( B-C\right) =0,N_{0}-{\mathrm {C}}=0 \end{array}\right. } \end{aligned}$$
(16)

Solving the above equations gives:

$$\begin{aligned} \alpha ^{*}={\left\{ \begin{array}{ll} \frac{A\left( 2N_{0}+B-2C\right) \pm \sqrt{\Delta }}{2\left( N_{0}-{\mathrm {C}} \right) \left( {\mathrm {B}}-{\mathrm {C}}+N_{0}\right) },&\quad {\mathrm {B}}-{\mathrm {C}}+N_{0}\ne 0\\ \frac{A}{C}-\frac{A}{B},&\quad{\mathrm {B}}-{\mathrm {C}}+N_{0}=0\\ \frac{A}{B}-\frac{A}{C},&\quad N_{0}-{\mathrm {C}}=0 \end{array}\right. } \end{aligned}$$
(17)

where \(C\ne 0\) and \(\Delta =4A^{2}BC\left( C^{2}-2CN_{0}+N_{0}^{2}+BN_{0}\right)\). Depending on how \(N_{0}-{\mathrm {C}}\) and \({\mathrm {B}}-{\mathrm {C}}+N_{0}\) compare with 0, the best strategy of the follower is obtained at \(\alpha =\alpha ^{*}\), \(\alpha =\alpha _{\min }\) or \(\alpha =\alpha _{\max }\). From this, it can be concluded that the follower's optimal strategy is \(\alpha \in \left\{ \alpha ^{*},\alpha _{\min },\alpha _{\max }\right\}\).
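The price selection can also be carried out numerically. The sketch below finds the real roots of Eq. 16 with the quadratic formula (rather than a closed form), keeps those inside \(\left[ \alpha _{\min },\alpha _{\max }\right]\), and compares the utility of Eq. 14 at these points and at the two endpoints. A stationary point may be a minimum as well as a maximum, which is why the utility is evaluated at every candidate. It is assumed that the logarithm in Eq. 14 is defined over the whole price interval; candidates where it is not are skipped. The function names and the numbers in the usage lines are hypothetical.

```python
import math


def u_c_reduced(alpha, A, B, C, D, n0):
    """Eq. 14; returns None where the logarithm is undefined."""
    denom = n0 * alpha - A - C * alpha
    if denom == 0:
        return None
    arg = 1.0 + B * alpha / denom
    if arg <= 0:
        return None
    return math.log2(arg) + A + alpha * C + D


def stationary_prices(A, B, C, n0):
    """Real roots of Eq. 16 (linear when the quadratic coefficient vanishes)."""
    a2 = -C * (n0 - C) * (B - C + n0)
    a1 = A * C * (2.0 * n0 + B - 2.0 * C)
    a0 = A * A * (B - C)
    if a2 == 0:
        return [] if a1 == 0 else [-a0 / a1]
    disc = a1 * a1 - 4.0 * a2 * a0
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return [(-a1 + root) / (2.0 * a2), (-a1 - root) / (2.0 * a2)]


def best_price(alpha_min, alpha_max, A, B, C, D, n0):
    """Pick the candidate price with the largest Eq. 14 utility."""
    candidates = [alpha_min, alpha_max]
    candidates += [a for a in stationary_prices(A, B, C, n0)
                   if alpha_min <= a <= alpha_max]
    scored = [(u_c_reduced(a, A, B, C, D, n0), a) for a in candidates]
    scored = [(u, a) for u, a in scored if u is not None]
    return max(scored)[1] if scored else None


if __name__ == "__main__":
    A = 1.0 / math.log(2.0)
    # Placeholder coefficients B, C, D, N0 for illustration only.
    print(stationary_prices(A, B=1.0, C=1.0, n0=2.0))
    print(best_price(1.6, 3.0, A, B=1.0, C=1.0, D=0.0, n0=2.0))
```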

This completes the analysis of radio resource management for the leader (train-to-train communication) and the follower (train-to-wayside communication) based on the Stackelberg game under the coexistence of train-to-wayside communication and train-to-train communication modes in urban rail transit.

4 Simulation and Verification

4.1 Simulation Model and Parameters

The scenario including train-to-train communication is shown in Fig. 3, in which trains meet the minimum tracking interval requirements, and CU links are established with the base station. In addition, the train group that meets certain conditions will establish a D2D communication link according to the train-to-train communication process assisted by the base station. The simulation parameters including train-to-train communication are shown in Table 2.

Table 2 Simulation parameters

4.2 Simulation Results and Analysis

4.2.1 Change of Train-to-Train Communication Distance

(1) Influence on the Strategies of Both Parties in the Game

Figure 4 describes the impact of train-to-train communication distance on the CU link strategy. It can be seen from the figure that as the train-to-train communication distance increases, the best price strategy for the CU link continues to decrease. The reason is that under the condition that the transmission power of the CU link train and the wireless channel gain are unchanged, as the channel quality of the D2D link continues to decrease, the revenue that the CU link obtains from the D2D link is also constantly decreasing. Therefore, in order to improve the reliability of the CBTC service and encourage D2D link communication, the price charged by the CU link is also continuously decreasing.

Fig. 4 The optimal price strategy of CU link

Figure 5 describes the influence of train-to-train communication distance on the D2D link power strategy. The D2D link chooses its best transmit power strategy based on the price charged by the CU link. It can be seen from Fig. 5 that the optimal transmit power of the D2D link increases with the train-to-train communication distance and can be divided into three segments: \(p_{\min }\), \(p_{d}^{*}\), and \(p_{\max }\). When the train-to-train communication distance is too small, the price charged by the CU link is very high because of the excessive interference to the CU link. To limit its loss, the best strategy for the D2D transmitter is \(p_{\min }\). When the train-to-train communication distance is moderate and the price charged by the CU link makes the stationary point of the D2D link utility function fall within \(\left[ p_{\min },p_{\max }\right]\), the optimal strategy of the transmitter is \(p_{d}^{*}\), and the CU link and the D2D link reach the game equilibrium. When the distance between trains is too large, the price charged by the CU link is low, but the path loss of the D2D link is large, so the best strategy of the transmitter is \(p_{\max }\) in order to improve its throughput.

Fig. 5 The optimal transmitted power strategy of D2D link

(2) Changes in throughput

In order to verify that train-to-train communication can improve the reliability of the CBTC service, this paper simulates the changes in the throughput of CBTC and in the throughput at the base station side of the CU link after the introduction of train-to-train communication, and compares them with the case of traditional train-to-wayside communication, as shown in Fig. 6.

Fig. 6 The comparison of throughput before and after introducing train-to-train communication

It can be seen from Fig. 6 that after the introduction of train-to-train communication, the throughput of the train CBTC service improves significantly, from 0.1 Mbps in traditional train-to-wayside communication to about 0.8 Mbps. This is because the D2D link improves the wireless signal quality at the receiving train, thereby increasing the throughput. In addition, with train-to-train communication, the throughput of the train CBTC service first declines, then rises, and then declines again as the train-to-train communication distance increases. The initial decline occurs because, although the optimal power strategy of the D2D link continues to rise, the increase cannot offset the path loss caused by the growing train-to-train communication distance. When the train-to-train communication distance increases to 0.4 km, the price charged by the CU link to the D2D link is 0. Figure 5 shows that the optimal power strategy of the train increases sharply at this point, reaching \(p_{\max }\), so the throughput of the CBTC service begins to rise. As the train-to-train communication distance continues to increase, the path loss keeps growing while the optimal train power strategy is limited by \(p_{\max }\), so the throughput of the CBTC service begins to decline again.

As for the throughput of the base station, because the D2D link multiplexes the uplink channel of the CU link, the base station inevitably suffers interference, so its throughput on this uplink channel is lower than in the traditional train-to-wayside communication mode. However, the CU link allows the D2D link to multiplex only one sub-channel, while the number of uplink channels available to the train is large. Taking a 10 MHz bandwidth as an example, the number of physical layer radio resource blocks is 50. Therefore, on the whole, the throughput of the base station is basically unchanged.
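The scale of this effect can be checked with simple arithmetic. The sketch below uses the standard LTE mapping from channel bandwidth to number of resource blocks and assumes, for illustration only, that the multiplexed sub-channel corresponds to a single resource block. The function name `shared_fraction` is hypothetical.

```python
# Resource blocks per LTE channel bandwidth in MHz (standard 3GPP values).
LTE_RESOURCE_BLOCKS = {1.4: 6, 3: 15, 5: 25, 10: 50, 15: 75, 20: 100}


def shared_fraction(bandwidth_mhz, shared_blocks=1):
    """Fraction of uplink resource blocks reused by the D2D link.

    Assumption: the shared sub-channel is counted as `shared_blocks`
    resource blocks (one by default).
    """
    total_blocks = LTE_RESOURCE_BLOCKS[bandwidth_mhz]
    return shared_blocks / total_blocks


if __name__ == "__main__":
    # For 10 MHz, one shared block out of 50 is 2 % of the uplink resources.
    print(f"{shared_fraction(10):.0%}")
```

Under this assumption only about 2 % of the uplink resource blocks are exposed to D2D interference at 10 MHz, which is consistent with the overall base station throughput remaining basically unchanged.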

4.2.2 Change of Train-to-Wayside Communication Distance

(1) Influence on the Strategies of Both Parties in the Game

As the train-to-wayside communication distance increases, the uplink signal received at the base station side keeps weakening. When the wireless channel quality of the D2D link and the transmit power of its transmitter remain unchanged, the interference from the D2D link received at the base station side remains unchanged, so the CU link revenue decreases. The CU link therefore tries to reduce the interference from the D2D link by charging a higher price, which is reflected in the optimal price strategy increasing with the train-to-wayside communication distance, as shown in Fig. 7.

Fig. 7 The optimal price strategy of CU link

For the train at the receiver of the D2D link, the interference from the uplink signal of the CU link train is decreasing. However, because the CU link is the leader and the price it charges keeps increasing, the D2D link gradually reduces the transmit power of its transmitter in order to pay as little as possible, as shown in Fig. 8.
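The leader-follower ordering described here can be illustrated by backward induction: the leader evaluates each candidate price by first computing how the follower would react to it. The Python sketch below is a minimal illustration that uses a simple grid search rather than the solution method of this paper. The function `leader_best_price` and its callable arguments are hypothetical placeholders for the CU link and D2D link utilities.

```python
def leader_best_price(prices, follower_response, leader_utility):
    """Backward induction over a finite grid of candidate prices.

    prices            -- iterable of candidate prices for the leader (CU link)
    follower_response -- callable: price -> follower (D2D) transmit power
    leader_utility    -- callable: (price, power) -> leader payoff
    Returns the tuple (best_price, follower_power, leader_payoff).
    """
    best_price, best_power, best_value = None, None, float("-inf")
    for price in prices:
        power = follower_response(price)      # follower reacts to the price
        value = leader_utility(price, power)  # leader evaluates the outcome
        if value > best_value:
            best_price, best_power, best_value = price, power, value
    return best_price, best_power, best_value
```

For example, with the toy follower `lambda c: max(0.0, 1.0 - c)` and the toy leader payoff `lambda c, p: c * p`, a grid from 0 to 1 in steps of 0.1 selects a price of 0.5. These toy functions only demonstrate the call pattern and carry no physical meaning.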

(2) Changes in throughput

This section simulates the influence of the train-to-wayside communication distance on throughput before and after the introduction of train-to-train communication; the results are shown in Fig. 9.

Fig. 8 The optimal transmitted power strategy of D2D link

Fig. 9 The comparison of throughput before and after introducing train-to-train communication

As can be seen from Fig. 9, after the introduction of train-to-train communication, the throughput of the train CBTC service at the receiver of the D2D link is significantly improved, and it increases with the train-to-wayside communication distance. This is because, although the transmit power of the D2D link keeps decreasing, the interference from the CU link train to the train at the receiver of the D2D link decreases faster. The throughput at the base station side, on the other hand, decreases with the train-to-wayside communication distance, because the path loss increases with distance while the transmit power of the CU link train remains unchanged. In addition, after the introduction of the train-to-train communication mode, the base station suffers interference on the shared channel, so its throughput is lower than under the traditional train-to-wayside communication mode.

5 Conclusion

This paper studies the radio resource management of urban rail transit communication systems in which train-to-train and train-to-wayside communication modes coexist, and proposes and verifies a resource scheduling scheme based on the Stackelberg game model. The simulation and analysis results show that the proposed algorithm can improve the reliability of the train CBTC service while leaving the overall train-to-wayside communication basically unaffected, and that it can provide a reference for the radio resource management of next-generation train control systems.