How Does High-Speed Rail Impact the Industry Structure? Evidence from China

The economic implications of high-speed rail (HSR) in China cannot be overlooked. This paper studies the impact of HSR on the advancement and rationalization of the industrial structure and on tertiary industry aggregation, using theoretical derivation and a multi-period difference-in-differences (DID) model, with the theoretical framework and empirical methods adapted to the characteristics of China's HSR and economic development. The transmission mechanism and its manifestations in the industrial structure are also discussed through analyses of urban heterogeneity and inter-industry spillover effects. The findings show that HSR promotes tertiary industry aggregation and contributes to the transformation of the industrial structure from the primary to the secondary and tertiary sectors, advancing the industrial structure but not rationalizing it. Furthermore, HSR has a more significant influence on tertiary industry aggregation in large cities and high-density cities. Additionally, the aggregation of the transportation, warehousing, and postal sectors is reduced, with a significant spillover effect on neighboring cities; this is evidence of the siphon effect and conduction mechanism and results in a structural shift in the tertiary industry from basic to advanced sectors. The movement of human resources is a key mediator in the economic impact of HSR.


Introduction
With a total length of 37,900 kilometers in 2020, high-speed railways (HSR) reach more than 80% of China's cities. Since the opening of HSR in China in 2008, the economy of the regions along the lines has developed rapidly. Taking the Beijing-Guangzhou and Beijing-Shanghai lines as examples, Fig. 1 and Fig. 2 show that the provinces along these two routes largely coincide with the provinces with high GDP in 2021, where the tertiary industry accounts for more than half of GDP. Spiekerman (1994) found that space-time distance is compressed after the opening of HSR [1]. However, the impact of HSR on the economy is not always positive: it may produce a siphon effect that suffocates the development of small and medium-sized cities [2]. The Beijing-Shanghai HSR, for example, establishes a 1-hour shuttle between Beijing, Tianjin, and Hebei, while simultaneously enabling Beijing to absorb the vast resources of Tianjin and Hebei. Without a doubt, the economic effect of HSR is significant, but how does HSR transform the industrial structure? Which sectors are affected as a result? Does HSR produce a siphon effect? Theoretical and empirical studies are needed to investigate these questions.
With the advancement of modern transportation networks, a growing number of papers have examined the effect of transportation infrastructure on economic growth [3][4][5]. Because China's HSR was launched late, research on China has gained significance only in recent years. The literature on the impact of HSR on economic growth is summarized in Appendix A. China's HSR development and policy differ from those of other countries, so there is a knowledge gap specific to China's HSR. Few analyses of China's HSR system take into consideration industry heterogeneity, which is central to research on industrial structure. Chandra and Thompson [6] found that certain industries profit from US interstates because of lower transportation costs, while others have migrated as economic activity relocated. Holl [7] studied the effects of road investment and the aggregation effect on enterprises in 13 industrial and 9 service sectors in Portugal, and found significant differences across industries. Western research materials and methodologies on the economic effects of HSR are numerous, since the West was the first to realize the industrial revolution [8][9][10]. Chen and Silva [11], Ahlfeldt and Feddersen [12], Chen and Haynes [13], Guirao et al. [14], Guirao et al. [15], and Cascetta et al. [16] studied the impact of the opening of HSR in Spain, Germany, China, France, and Italy, and acknowledged a positive impact. Recent papers on the economic consequences of HSR concentrate on individual industries and production factors, including regional labor mobility, economic development, and urbanization [17][18][19][20]. Zheng et al. [21] and Pan et al. [22] measured the spatial spillover effect of HSR stations in China and found that HSR spurs the agglomeration of various economic activities both near to and far from the station.
As for empirical methods, the difference-in-differences (DID) model is widely used. Wang et al. [23] used anti-gravity and gravity bias models to study how the urban structure of the Yangtze River Delta evolved after the opening of HSR. Liu et al. [24] used a time-varying DID model to study the spatial integration and industrial development of the Yangtze River Economic Belt, and explained the relationship between the spatial integration of urban aggregations and the characteristics of industrial development after the opening of HSR. Liang et al. [25] and Wang et al. [28] used the PSM-DID model to analyze economic development along HSR lines based on panel data of prefecture-level cities. Li et al. [26] used the synthetic control method (SCM) to investigate the economic effect of HSR and found strong disparities in that effect. Li et al. [27] used both the DID model and threshold regression and found that the opening of HSR had a significant threshold effect on improving the efficiency of the service sector. Meng et al. [29] constructed an HSR operation network (HSRON) model to study the impact of network position (NP) on service industry agglomeration (SIA) by employing complex network analysis and panel regression methods. Melo et al. [30] found that estimates of the productivity impact of transportation infrastructure differ by major industry group, with estimates for the US economy averaging higher than those for European countries, and estimates for highways averaging higher than those for other modes of transportation. In the empirical analysis of HSR in China, few papers have focused on the quantification of transportation costs. A well-established theoretical framework for analyzing the impact of HSR in China is therefore needed, and this paper builds on previous research to provide a more in-depth assessment.
The contribution of this paper is to improve the theoretical and empirical framework regarding the economic impact of HSR in China on the basis of previous papers. On the one hand, the mathematical economic geography theory of the impact of the opening of the HSR on the industrial structure was proposed, which comprehensively complemented the hypotheses on the economic effects of HSR. The empirical analysis, on the other hand, was in line with the reality in China. In contrast to other studies, we chose appropriate control and treatment groups to avoid cross-influence of lines within the area. In addition to verifying the hypotheses, the mechanism and heterogeneity of HSR on the industrial structure were explored, revealing the manifestations of economic transformation in China.
The remainder of this paper is structured as follows. Section 2 presents the theoretical framework. Section 3 describes the data set. Our empirical results are presented in Sect. 4. Our main conclusions are then summarized in Sect. 5.

Theoretical and Analytical Framework
The enhancement of accessibility is a direct result of the opening of HSR [10,31]. In addition to decreasing distances in time and space, the HSR indirectly increases the flow of factors such as labor, social capital, and technology, thereby reshaping the market structure [1,5]. Moreover, the opening of HSR facilitates the establishment of economies of scale, which further alters market competition patterns, as seen by the clustering of urban tertiary sectors and the expansion of cities not on the route [8,32]. The rationality of the industrial structure will be influenced by the advancement of the industrial structure [14]. As the cycle progresses, the advantageous industries will cluster, and the industrial structure will shift as well, both of which are manifestations of indirect impacts. The impact of the introduction of HSR on the industrial structure is seen in Fig. 3.
The theories and hypotheses proposed in this paper for the impact of HSR on the industrial structure are built using the core-periphery concept. Area A and area B are the two economic zones. Assume there are no costs or restrictions on labor mobility in either area; consumers are rational, and they spend their wages on tradable product x and product y to maximize their utility.
where $0 < \mu < 1 < \sigma$, $C_{Ax}$ is the product x consumed in area A, and $C_{Ay}$ is the product y consumed in area A. $P_{Ax}$ and $P_{Ay}$ are the corresponding prices, following the form of the D-S model. $C_{Ax}$ is expressed by constant elasticity of substitution (CES). $W_A$ is a consumer's income in area A. If consumer indifference curves are continuous, then:
where $p_i$ is the price of the i-th product. After optimization, the indirect utility function is:
where $\mu$ is the expenditure share for the i-th product. With the optimization conditions of suppliers, we define $W_B$ to be the income of consumers in area B, and $s_n$ to be the proportion of product x in area A among all product x ($s_n = n_i/n$, where $n$ is the total of x). This paper introduces the theory of "iceberg transportation cost" proposed by Samuelson, which holds that there is a cost, represented by $T$, and only a fraction $1/T$ ($1/T < 1$) of the products arrives when products are transported from area A to area B. Then, (4) can be transformed into the following:
where $\sigma$ is the elasticity of substitution, and $T$ is the product's transportation cost between area A and area B. Substituting (6) into (5), we get:
where $\alpha = \mu/(\sigma - 1)$; similarly, we get:
The "accessibility" between locations improves after the opening of the HSR. Assume $T = e^{\tau \times t(H)}$, where $H$ is a dummy variable for whether HSR is opened, $\tau$ is time attenuation, and $t$ is the average travel time from area A to area B after the opening of HSR. According to the location selection of the long-term equilibrium, the comparative utility function is as follows:
Denote $(e^{\tau \times t(H)})^{\alpha\sigma - \alpha} = \lambda$. If we substitute the bivariate Taylor series expansions of $\frac{s_n}{1-s_n}$, $\frac{W_A}{W_B}$, and $\ln\left[1 - \frac{\alpha}{\lambda}(1-\lambda^2)\frac{s_n}{1-s_n}\right]$ into (9) and take the logarithm, we get:
The consumer utilities in the two areas are equal in the long-term equilibrium, that is:
Denote $\frac{s_n}{1-s_n} = S$; then we get:
$\frac{\partial S}{\partial H}$ is the impact of HSR on the industrial structure.
Take the derivative of (12) with respect to $H$:
It can be concluded that $0 < \frac{\partial \lambda}{\partial H} < 1$, and $\frac{\partial S}{\partial H}$ can be determined by (13). If $\frac{\partial S}{\partial H} > 0$, the behavior is "aggregation"; if $\frac{\partial S}{\partial H} < 0$, it is "dispersion." The second derivative of (12) involves the terms $\alpha\left[\ln\frac{W_A}{W_B} + (1-\mu)\ln\frac{P_{Ay}}{P_{By}}\right]$ and $-4\lambda(\lambda^2 - 1 - \ln\lambda - \lambda^2\ln\lambda)$, and two cases are distinguished by comparing $\left[\ln\frac{W_A}{W_B} + (1-\mu)\ln\frac{P_{Ay}}{P_{By}}\right]$ with 0.
(1) In the first case, Fig. 4a shows that the opening of the HSR can boost the number of industries, but the pace of growth will slow at the periphery.
(2) In the second case, Fig. 4b shows $\frac{\partial^2 S}{\partial H^2} < 0$ at the beginning, $\frac{\partial^2 S}{\partial H^2} > 0$ at the end, and only one zero point. There are two possible situations. First, $\frac{\partial S}{\partial H} > 0$ holds identically within the domain: the HSR is always in an "aggregating state" once it is opened, although the pace of aggregation varies. Second, $\frac{\partial S}{\partial H} > 0$ is observed initially: the aggregation effect of the HSR strengthens over time until it reaches its peak, and the dispersion effect comes later, forcing suppliers with lower competitiveness to migrate to the outskirts. Finally, industries pool their resources to establish higher-quality tertiary industrial conglomerates.
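The iceberg cost assumption used in the derivation above, $T = e^{\tau \times t(H)}$, can be illustrated numerically. The following is a minimal Python sketch; the decay rate `tau` and the two travel times are hypothetical values chosen only for illustration, not estimates from this paper.

```python
import math

def iceberg_cost(tau: float, travel_time: float) -> float:
    """Iceberg factor T = exp(tau * t); shipping one unit delivers 1/T units."""
    return math.exp(tau * travel_time)

# Hypothetical values: tau and the travel times (in hours) are illustrative only.
tau = 0.1
t_without_hsr = 10.0  # t(H = 0)
t_with_hsr = 4.0      # t(H = 1)

for label, t in [("H = 0", t_without_hsr), ("H = 1", t_with_hsr)]:
    T = iceberg_cost(tau, t)
    print(f"{label}: T = {T:.3f}, share of goods arriving = {1 / T:.3f}")
```

A shorter travel time lowers T, so a larger share of the shipped goods arrives; this is the channel through which the HSR dummy H enters the model.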
The following hypotheses are proposed according to the derivation: Hypothesis I: Once the HSR opens, the number of tertiary industry businesses will increase and higher-grade industries will continue to aggregate. In the short term, it encourages tertiary industry aggregation in cities along the route, but in the long run, the aggregation is dependent on the initial circumstances.
Hypothesis II: Once the HSR opens, the industrial structure advances steadily, resulting in a structural shift in the tertiary industry, from basic to advanced sectors.

Objects
This paper chooses the Beijing-Shanghai and Beijing-Guangzhou HSR as the research objects (see Appendix B for details). On the one hand, they were built at an early stage. The Beijing-Shanghai HSR is China's first dedicated long-distance passenger route, and it goes through some of China's most densely inhabited and economically developed districts. The Beijing-Guangzhou HSR is a watershed moment in China's HSR development and the world's longest HSR. On the other hand, their construction was relatively quick, which helps to eliminate cross-effects caused by other factors. The Beijing-Shanghai and Beijing-Guangzhou HSR run through a total of 131 prefecture-level cities and municipalities. To obtain unbiased estimates, we omitted, among others, cities with other HSR lines passing through; prefecture-level cities that were demoted or divided during the sample period, such as Chaohu; and cities with poor data availability, such as Enshi and the Shennongjia forest area. The control group consists of 40 cities, and the full sample comprises 82 prefecture-level cities.

Dependent Variable
The dependent variables are the location entropy of the tertiary industry (LQ third), the rationalization of industrial structure (SR), and the advancement of industrial structure (SA). Location entropy is a commonly used index to measure the distribution of elements, and the location entropy of industry i in area j in the country ($LQ_{ij}$) is as follows:
$$LQ_{ij} = \frac{q_{ij}/q_j}{q_i/q}$$
where $q_{ij}$ is the indicator of industry i in area j, $q$ is the indicator of all industries in the country, $q_j$ is the indicator of all industries in area j, and $q_i$ is the indicator of industry i in the country. The larger the $LQ_{ij}$ value, the higher the aggregation degree of industry i in area j, implying that industry i has a competitive advantage in area j.
Fig. 4 Forms of industrial aggregation and diffusion of HSR
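Location entropy compares the share of an industry in a city with the share of the same industry in the country. A minimal Python sketch of this ratio follows; the function name and the employment figures are hypothetical and serve only as an illustration.

```python
def location_quotient(q_ij: float, q_j: float, q_i: float, q: float) -> float:
    """LQ_ij = (q_ij / q_j) / (q_i / q)."""
    if min(q_j, q_i, q) <= 0:
        raise ValueError("totals must be positive")
    return (q_ij / q_j) / (q_i / q)

# Hypothetical employment figures (in 10,000 persons), for illustration only.
city_tertiary, city_total = 60.0, 100.0
national_tertiary, national_total = 400.0, 1000.0

lq = location_quotient(city_tertiary, city_total, national_tertiary, national_total)
print(f"LQ = {lq:.2f}")
```

A value above 1 means the industry is more concentrated in the city than in the country as a whole.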
The advancement of the industrial structure (SA) is affected by multiple factors, and there is currently no standard definition. We construct an index to quantify SA using the Moore structural change value technique. A three-dimensional vector $X_0 = (x_{1,0}, x_{2,0}, x_{3,0})$ is constructed from the shares of the three industries in GDP. The angles between $X_0$ and $X_1 = (1, 0, 0)$, $X_0$ and $X_2 = (0, 1, 0)$, and $X_0$ and $X_3 = (0, 0, 1)$ are $\theta_1$, $\theta_2$, and $\theta_3$, respectively.
SA is thus defined as follows:
The rationalization of the industrial structure (SR) measures whether the proportion of industries is appropriate. The following is a commonly used measure that assigns weights according to the significance of each of the three industries:
where GDP is the city's gross product, $GDP_i/L_i$ is labor productivity, $GDP_i/GDP$ is the output structure, and $L_i/L$ is the employment structure. Primary, secondary, and tertiary industries are represented by $i = 1, 2, 3$. The industrial structure deviation is 0 when the output structure matches the employment structure, which is the most appropriate situation. It should be noted that the lower this index is, the more the industrial structure has been rationalized.
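As a rough illustration of the two indices, the following Python sketch assumes two forms that are common in this literature: SA as the cumulative sum of the vector angles, $SA = \sum_{k=1}^{3}\sum_{j=1}^{k}\theta_j$, and SR as a Theil-type index, $SR = \sum_i (GDP_i/GDP)\ln[(GDP_i/L_i)/(GDP/L)]$. Both forms are assumptions rather than confirmed definitions from this paper, and all input numbers are hypothetical.

```python
import math

def advancement(shares):
    """SA under the assumed form sum_{k=1..3} sum_{j=1..k} theta_j.

    theta_j is the angle between the share vector X0 and the j-th unit vector,
    so cos(theta_j) = x_j / ||X0||.
    """
    norm = math.sqrt(sum(x * x for x in shares))
    thetas = [math.acos(min(1.0, x / norm)) for x in shares]
    return sum(sum(thetas[:k]) for k in range(1, len(thetas) + 1))

def rationalization(gdp, labor):
    """SR under the assumed Theil-type form; 0 when output and employment shares match."""
    total_gdp, total_labor = sum(gdp), sum(labor)
    return sum(
        (g / total_gdp) * math.log((g / l) / (total_gdp / total_labor))
        for g, l in zip(gdp, labor)
    )

# Hypothetical city: values for the primary, secondary, and tertiary industries.
print(advancement([0.10, 0.40, 0.50]))
print(rationalization([10.0, 40.0, 50.0], [30.0, 30.0, 40.0]))
```

Under these assumed forms, a larger tertiary share raises SA, and SR grows as the output structure departs from the employment structure.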

Independent Variable
The treatment group ($treated_c$, set to 1) consists of cities along the HSR, whereas the control group ($controlled_c$, set to 0) consists of cities not on the HSR. There is also a time dummy variable, which is 0 before the HSR's opening and 1 after it. The core independent variable $HSR_i$ is the interaction term, that is, the product of these two dummies. The Beijing-Shanghai HSR was opened in 2011, with data from 2001 to 2010 as pre-opening data and data from 2011 to 2017 as post-opening data. The Beijing-Wuhan segment of the Beijing-Guangzhou HSR opened in 2012, while the Wuhan-Guangzhou section opened in 2009; the Beijing-Guangzhou HSR is coded in the same manner. The particular expression is as follows:
where $Z_i$ denotes the city-level fixed effects that do not change over time, $X_{it}$ are the time-varying control variables, and $\varepsilon_{it}$ is the residual term. The difference induced by HSR is represented by $\beta_1$, and the difference between cities is represented by $\beta_2$.
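The construction of the core variable as the product of the treatment and time dummies can be sketched in Python as follows. The opening years come from the text; the function name and city labels are placeholders, not actual sample cities.

```python
from typing import Optional

def hsr_dummy(open_year: Optional[int], year: int) -> int:
    """treated * post: 1 for a city on the line from its opening year onward, else 0."""
    treated = 1 if open_year is not None else 0
    post = 1 if open_year is not None and year >= open_year else 0
    return treated * post

# Opening years from the text; the city labels are placeholders.
opening = {
    "city on Beijing-Shanghai line": 2011,
    "city on Wuhan-Guangzhou section": 2009,
    "city on Beijing-Wuhan section": 2012,
    "control city": None,
}

for city, open_year in opening.items():
    row = [hsr_dummy(open_year, y) for y in range(2008, 2014)]
    print(f"{city:32s} {row}")
```

Because opening years differ by segment, the dummy switches on at different times across cities, which is what makes the design a multi-period DID.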

Summary Statistics
Openness (lfdi), Economy (lgdp average), Wage (lwage average), Government interference (lfee), Human capital (lhuman capital), Informatization (ltele pop), Transportation (lroad average), and Infrastructure (lbooks average and lhospital num) are the control variables. The sample period is 2003-2017, and the data come from the China City Statistical Yearbook and the China Statistical Yearbook. We use the logarithm of the data throughout the paper and provide the summary descriptive statistics in Table 1. After the opening of the HSR, the location entropy of the tertiary industry and the advancement and rationalization indices of the industrial structure are higher, which motivates the empirical analysis that follows.

Parallel Trend Test
The parallel trend test, as described by Shao [17], is used to determine whether trends in the treatment and control groups are consistent:
where $t$ is the first year of HSR operation. The year before the opening is $m$ ($m = 1, 2, 3$), while the year following is $n$ ($n = 0, 1, 2, 3, 4, 5$). $FirstHSR_{i,t}$, $FirstHSR_{i,t-m}$, and $FirstHSR_{i,t+n}$ are dummy variables that equal 1 if city i's HSR is operational in the specified year. The initial 4 years ($year = -4, -3, -2, -1$) and the latter 4 years ($year = 5, 6, 7, 8$) are both significant, as shown in Fig. 5. As a result, the opening of HSR has a large delayed impact on the aggregation of tertiary industries, and this conclusion is consistent with Shao [17]. Before $year = 4$, $\beta_m$ increases year by year, which can be explained by firms making strategic plans after learning of the opening ahead of time [5,33]. It may be inferred that there is no significant difference in trend between the treatment and control groups before HSR's opening. By repeating the steps, it is confirmed that both industrial structure advancement and rationalization satisfy the parallel trend hypothesis.

Baseline Regression
The opening of HSR can be regarded as a quasi-natural experiment. Multi-period DID is used as the identification model, in the form:
where $\delta_u$ is the entity fixed effect, $\delta_t$ is the time fixed effect, and $\delta_1$ is the coefficient of double differences, which is also the value of interest in this paper. The coefficients of HSR remain significant and positive when control variables are included in models (1) to (6) in Table 2. Specifically, ceteris paribus, LQ third and SA increased by 0.029 and 0.021, respectively, after HSR opened. This supports Hypothesis I and Hypothesis II of this paper, namely, that the HSR has a significant positive impact on tertiary industry aggregation and industrial structure advancement. The significant positive coefficient of SR indicates a deviation from equilibrium in the industrial structure: the opening of the HSR causes an imbalance in the local industrial structure and a lack of coordination in resource allocation. Wang (2019) studied the Yangtze River Delta region and discovered that HSR boosted the proportion of tertiary industry added value and improved the quality of urbanization by stimulating industrial structure upgrading [21]. Liang (2020) performed research on the Guangdong-Guangxi-Guizhou HSR and discovered that by altering the industrial structure of the area, the HSR may promote the development of undeveloped regions [25]. These conclusions are similar to the findings of this paper.
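The logic of a multi-period DID with entity and time fixed effects can be sketched on a small synthetic panel. The Python example below is an illustration under simplifying assumptions (balanced panel, no control variables, no standard errors); the function name and the data are hypothetical and are not the paper's estimation code.

```python
def twfe_did(y, d):
    """Two-way fixed-effects DID coefficient for a balanced panel, no controls.

    y[i][t] is the outcome and d[i][t] the HSR dummy of unit i in period t.
    Double demeaning removes unit and period effects; the slope is then OLS.
    """
    n, t = len(y), len(y[0])

    def demean(z):
        unit = [sum(row) / t for row in z]
        period = [sum(z[i][j] for i in range(n)) / n for j in range(t)]
        grand = sum(unit) / n
        return [[z[i][j] - unit[i] - period[j] + grand for j in range(t)]
                for i in range(n)]

    yd, dd = demean(y), demean(d)
    num = sum(dd[i][j] * yd[i][j] for i in range(n) for j in range(t))
    den = sum(dd[i][j] ** 2 for i in range(n) for j in range(t))
    return num / den

# Synthetic panel (not the paper's data): 3 cities, 2008-2013, true effect 0.5.
years = list(range(2008, 2014))
open_years = [2011, 2009, None]  # None marks a control city
city_effect = [1.0, 2.0, 3.0]
d = [[1.0 if oy is not None and yr >= oy else 0.0 for yr in years]
     for oy in open_years]
y = [[city_effect[i] + 0.1 * (yr - 2008) + 0.5 * d[i][k]
      for k, yr in enumerate(years)] for i in range(3)]

print(round(twfe_did(y, d), 6))
```

Because the synthetic outcome contains only city effects, year effects, and a constant treatment effect, the double-demeaned regression recovers that effect; with real data, controls and clustered standard errors would also be needed.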

Changes in the Sample
The robustness tests include sample changes, instrumental variables, and a placebo test. The changes in the sample come from adjusting the period scope [Models (1) to (3)] and excluding provincial capitals and municipalities [Models (4) to (6)]. The HSR coefficients remain significantly positive in Table 3, except for Model (5). The insignificance of Model (5) might be due to endogenous differences across cities; the heterogeneous effect is discussed later. As a result, the positive effect of HSR is consistent across samples, and the conclusion in this paper is robust.

Instrumental Variables (IV)
To test for endogeneity owing to omitted variables, an instrumental variable (IV) approach is used in this paper. The IVs for transportation infrastructure commonly mentioned in previous papers include landslides, geographic slope, ancient postal services, historical passenger traffic, and historical planning information [34][35][36]. Following Dong [18], this paper uses China's historical railway network in 1961 as the IV for the opening of the HSR. The Beijing-Shanghai and Beijing-Guangzhou HSR construction is based on the railways constructed in 1961, and railways built in 1961 have no direct impact on the present industrial structure. The 1961 railway network therefore satisfies both exogeneity and relevance. The results in Table 4 show that the effect of HSR on industrial structure advancement (SH) is significant at 0.0822, which is consistent with the baseline regression in Table 2. The correlation between the instrumental and explanatory variables is also verified (the coefficient of the old railway on HSR is 0.2198), and the hypothesis of a weak instrument is rejected (F-statistic > 10).
Note: *, **, and *** indicate significant levels at 10%, 5%, and 1%, respectively

Placebo Test
To ensure that the results are not driven by chance, a placebo test is used, in which the treatment group is generated at random. The estimated coefficient of interaction in (22) is as follows:
where $z$ is the control variables. When $\gamma = 0$, the estimator $\hat{\beta}_1$ is unbiased. If $treated_c \times time_t$ is replaced by a variable that does not affect the explained variables ($\beta_1 = 0$), and $\hat{\beta}_1 = 0$ is obtained by estimation, then $\gamma = 0$ can be inferred. Following this line of reasoning, we make the opening of HSR random, so that it has no effect on $LQ\,third_{ct}$, $SA_{ct}$, and $SR_{ct}$, i.e., $\beta^{random} = 0$. The distribution of $\hat{\beta}^{random}$ is obtained by repeating this procedure, as shown in Fig. 6; the t-statistics follow an inverted-U-shaped distribution with a peak around zero.
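The placebo procedure can be sketched in Python as follows. This is an illustration under simplifying assumptions: a balanced panel, no control variables, pseudo-treated cities and pseudo opening periods drawn uniformly at random, and coefficients rather than t-statistics collected. The function names and the synthetic data are hypothetical.

```python
import random

def twfe_coef(y, d):
    """Two-way fixed-effects slope of y on d for a balanced panel (double demeaning)."""
    n, t = len(y), len(y[0])

    def demean(z):
        unit = [sum(row) / t for row in z]
        period = [sum(z[i][j] for i in range(n)) / n for j in range(t)]
        grand = sum(unit) / n
        return [[z[i][j] - unit[i] - period[j] + grand for j in range(t)]
                for i in range(n)]

    yd, dd = demean(y), demean(d)
    num = sum(a * b for ra, rb in zip(dd, yd) for a, b in zip(ra, rb))
    den = sum(a * a for ra in dd for a in ra)
    return num / den

def placebo_coefs(y, n_treated, reps=500, seed=0):
    """Re-estimate the DID coefficient under randomly assigned pseudo-treatments.

    Requires 1 <= n_treated < number of units and at least two periods.
    """
    rng = random.Random(seed)
    n, t = len(y), len(y[0])
    coefs = []
    for _ in range(reps):
        fake_treated = set(rng.sample(range(n), n_treated))
        # Pseudo opening period in 1..t-1 keeps at least one pre and one post period.
        start = {i: rng.randrange(1, t) for i in fake_treated}
        d = [[1.0 if i in fake_treated and j >= start[i] else 0.0 for j in range(t)]
             for i in range(n)]
        coefs.append(twfe_coef(y, d))
    return coefs

# Synthetic outcome with city effects, period effects, and noise, but no HSR effect.
rng = random.Random(1)
y = [[0.3 * i + 0.1 * j + rng.gauss(0.0, 1.0) for j in range(10)] for i in range(20)]
coefs = placebo_coefs(y, n_treated=8)
print(sum(coefs) / len(coefs))
```

If the baseline estimate is not an artifact, it should lie in the tail of the placebo distribution, which should itself be centered near zero.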

Socioeconomic Characteristics Among Cities
The effect of HSR varies across regions due to differences in endowment [37][38][39]. This paper takes 3 million people and 0.5 million people per square kilometer as the classification thresholds for city size and population density, respectively (see footnote 5). Table 5 shows that the economic effects in metropolises and cities with high population density are statistically more significant than those in other cities, implying a link between socioeconomic characteristics and the impact of HSR on the industrial structure. This is to be expected, since megacities and cities with high population density have better market conditions and a greater effect on factor flows, making them more likely to achieve industrialization and structural upgrading.

Spillover and Aggregation Effects Across Industries
The spatial weight matrix captures spatial correlation according to whether two economic units are geographically adjacent. This paper uses queen adjacency to generate an 82 × 82 adjacency matrix ($w_{ij}$). The spatial matrix and Moran's I are expressed as follows:
where $x_i$ is the value of unit i, $x_j$ is the value of unit j, $x_m$ is the average value of the grid cells in the area, $n$ is the total number of grid cells, and $W_{ij}$ is the spatial weight matrix. Z (Moran's I) is used to test the significance of Moran's I, and the null hypothesis is that there is no spatial autocorrelation.
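For illustration, the standard global Moran's I, $I = \frac{n}{\sum_i\sum_j w_{ij}} \cdot \frac{\sum_i\sum_j w_{ij}(x_i - x_m)(x_j - x_m)}{\sum_i (x_i - x_m)^2}$, can be computed with a few lines of Python. The sketch below assumes a binary contiguity matrix and uses a toy four-unit example; it is not the 82 × 82 matrix of this paper, and the function name is hypothetical.

```python
def morans_i(values, w):
    """Global Moran's I for values x_i and a spatial weight matrix w[i][j]."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)  # sum of all weights
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(dv * dv for dv in dev)

# Toy example: four units in a row; neighbors share a border (binary contiguity).
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
x = [1.0, 2.0, 3.0, 4.0]
print(round(morans_i(x, w), 4))
```

A positive value indicates that neighboring units tend to have similar values; under queen adjacency, units sharing either a border or a corner count as neighbors.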
Moran's I was calculated for 18 subsectors, excluding agriculture, forestry, animal husbandry, and fishery, and the results and Industry Classification Standard are shown in. Although most of the $LQ_X$ in the sample are not statistically significant at the 1% level in each year, the spatial correlation is still worth investigating given the delayed impact of the HSR opening seen in Fig. 5 (see footnote 6). According to the results of the spatial autocorrelation test, the spatially relevant industries are as follows: mining ($LQ_2$); electricity, gas, and water production and supply ($LQ_4$); transportation, warehousing, and postal ($LQ_7$); accommodation and catering ($LQ_8$); financial ($LQ_{10}$); real estate ($LQ_{11}$); scientific research and technical services ($LQ_{13}$); education ($LQ_{16}$); health and social security ($LQ_{17}$); and public management and social organizations ($LQ_{19}$).
Spatial lag models (SAR), spatial error models (SEM), and spatial Durbin models (SDM) are examples of spatial econometric models. SAR and SEM assume spatial autocorrelation in the dependent variable and in the error term, respectively, whereas SDM considers both. To obtain SAR and SEM, the Kronecker product is employed to integrate the spatial matrix across time. Their expressions are as follows:
Spatial lag model (SAR):
Spatial error model (SEM):
Spatial Durbin model (SDM):
where $\beta_0$ is a constant, $\beta$ is the matrix of the variable coefficients, $X$ is the matrix of the independent variables, and $W_{ij}$ is the weight matrix. $\rho$ is the spatial autoregressive coefficient, $\lambda$ is the spatial autocorrelation coefficient, and $\theta$ is the spatial spillover effect. $\alpha_i$ and $\gamma_t$ measure spatial fixed effects and time fixed effects, respectively. $\varepsilon_{it}$ is the error term, which follows a normal distribution. This paper uses the approach described by Pace (2009) to separate the estimation of direct and indirect effects from SAR and SDM [40], and it takes the following form:
Footnote 5: The reference is the scale classification standard of the State Council of China for large cities (type I) and small and medium-sized cities (type II), and the population density standard of the State Council of China for the construction of metro cities.
Footnote 6: The economic effects of the high-speed rail's opening are numerous, but it is impossible to cover everything at this time. We appropriately relax the significance threshold to 20% and drop the indicators whose Moran's I is insignificant.
The partial derivatives with regard to the k-th independent variable from area 1 to area N are as follows:
where $w_{ij}$ is the element $(i, j)$ of the matrix $W_{ij}$. The direct effect is the average of the diagonal elements of the matrix (32). The indirect effect is the average of the row (or column) sums of the non-diagonal elements, which is also the spillover effect. The spatial panel models are estimated by maximum likelihood estimation (MLE) to avoid biased estimators. When LM-err is significant, SEM is selected as the optimal spatial model, and when LM-lag is significant, SAR is selected. If both are significant, their robust versions are compared. The results are shown in Tables 14 to 16 in Appendix D, indicating that SDM is the model with the best explanatory power. As can be seen in Table 6, the aggregation of the transportation, warehousing, and postal sectors is significantly reduced as a consequence of the HSR's opening, and these sectors have a significant spillover impact on neighboring cities. Because HSR boosts urban housing costs, relatively low-end transportation and warehousing activities in the tertiary sector are shifted to more distant locations, with nearby cities being suitable places to receive them, leading to an increase in the aggregation of these industries there. This phenomenon is called the siphon effect [2]. Furthermore, since the opening of HSR, there has been a structural shift in the tertiary industry from basic to advanced sectors. The analysis of the heterogeneous effects of HSR across industries shows that HSR has an overall positive effect on most sectors, indicating that the opening of HSR has fostered the aggregation of the tertiary industry as a whole.
Debrezion (2007) [41], He (2020) [42], Huang (2020) [43], Shao (2017) [17], and Wang (2018) [44] studied the impact of HSR on real estate, automobiles, services, and finance, respectively, and found heterogeneity in the impact of HSR across industries, which is consistent with the findings in this paper.
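The decomposition into direct and indirect effects described above can be sketched in Python. The example assumes the usual effects matrix $S = (I - \rho W)^{-1}(\beta I + \theta W)$, with $\theta = 0$ for SAR, and a toy row-standardized weight matrix for two regions; the coefficient values and function names are hypothetical, not estimates from Table 6.

```python
def invert(a):
    """Gauss-Jordan inverse with partial pivoting for a small dense matrix."""
    n = len(a)
    m = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]

def spatial_effects(w, rho, beta, theta=0.0):
    """Average direct and indirect effects of one regressor.

    Uses S = (I - rho*W)^(-1) (beta*I + theta*W); theta = 0 gives the SAR case.
    """
    n = len(w)
    a = [[(1.0 if i == j else 0.0) - rho * w[i][j] for j in range(n)]
         for i in range(n)]
    b = [[(beta if i == j else 0.0) + theta * w[i][j] for j in range(n)]
         for i in range(n)]
    inv = invert(a)
    s = [[sum(inv[i][k] * b[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    direct = sum(s[i][i] for i in range(n)) / n   # mean of diagonal elements
    total = sum(sum(row) for row in s) / n        # mean of row sums
    return direct, total - direct

# Toy example: two regions that are each other's only neighbor.
w = [[0.0, 1.0], [1.0, 0.0]]
direct, indirect = spatial_effects(w, rho=0.5, beta=1.0)
print(round(direct, 4), round(indirect, 4))
```

The indirect effect is positive here even though $\theta = 0$, because a change in one region feeds back through the spatial lag of the dependent variable; this feedback is why raw coefficients from SAR and SDM are not read as marginal effects.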

Human Capital's Mediation Effect
Following the demonstration of the heterogeneity in the impact of HSR across cities, it should be determined whether the impact is produced by the flow of human resources. In this paper, the number of college students is used as the variable that quantifies human resources. If the coefficients of the three regressions are significant but $c'$ is small, then human resources have an intermediary effect on the impact of HSR on the industrial structure. The steps are as follows:
where $X$ is HSR, $Y$ are the dependent variables, and $M$ is the mediator variable. The significance of HSR's coefficients in Models (1), (3), and (5) in Table 7 demonstrates its explanatory power for $Y$ and $M$. In Model (2), the partial intermediary effect of human resources is shown, with a value of $ab/c = 0.0723$ for LQ third as the explained variable. Because the lwrdxs coefficients in Model (4) are insignificant, the bootstrap test is required to obtain a distribution that is close to the population, and the results for SA are reported in Table 8. The coefficient in Table 8 is significantly positive, suggesting that human resources have only an indirect effect. Combining Tables 7 and 8, it can be concluded that human resources serve as a complete intermediary in the impact of HSR on the industrial structure.
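The three-step mediation procedure, $Y = cX + e_1$, $M = aX + e_2$, and $Y = c'X + bM + e_3$, can be sketched in Python. The example below omits control variables, fixed effects, and significance tests, and uses constructed data in which the true values are $a = 2$, $b = 0.5$, and $c' = 1$; the function names and data are hypothetical.

```python
def _sums(u, v):
    """Centered cross-product sum of two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((p - mu) * (q - mv) for p, q in zip(u, v))

def mediation(x, m, y):
    """Three-step mediation: returns (c, a, b, c_prime, a*b/c)."""
    sxx, smm, sxm = _sums(x, x), _sums(m, m), _sums(x, m)
    sxy, smy = _sums(x, y), _sums(m, y)
    c = sxy / sxx                     # step 1: Y = c*X + e1
    a = sxm / sxx                     # step 2: M = a*X + e2
    det = sxx * smm - sxm ** 2        # step 3: Y = c'*X + b*M + e3
    b = (sxx * smy - sxm * sxy) / det
    c_prime = (smm * sxy - sxm * smy) / det
    return c, a, b, c_prime, a * b / c

# Constructed data: M = 2*X + u and Y = 1*X + 0.5*M, with u uncorrelated with X.
x = [0.0, 0.0, 1.0, 1.0]
m = [-1.0, 1.0, 1.0, 3.0]
y = [-0.5, 0.5, 1.5, 2.5]
c, a, b, c_prime, share = mediation(x, m, y)
print(c, a, b, c_prime, share)
```

The ratio $ab/c$ is the share of the total effect that runs through the mediator; a significant $c'$ alongside a significant $ab$ points to partial mediation, while an insignificant $c'$ points to complete mediation.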

Discussion and Conclusion
With China's fast expansion of HSR, how to accurately quantify the impact of HSR on the industrial structure is of great concern to many scholars. In light of China's current state of the economy, the theoretical framework and hypothesis of the impact of HSR on the industrial structure are derived using the core-periphery model. This demonstrates that the impact of HSR on industrial structure aggregation includes decreasing-speed and U-shaped-speed aggregation, while the impact of HSR on industrial structure rationalization is uncertain. A series of empirical studies are based on the three hypotheses given in this paper. The findings show that, first, HSR promotes tertiary industry aggregation and contributes to the transformation Note: *, **, and *** indicate significant levels at 10%, 5%, and 1%, respectively Note: *, **, and *** indicate significant levels at 10%, 5%, and 1%, respectively of the industrial structure from primary to the secondary and tertiary industry sector, as well as realizing the industrial structure advancement but not rationalization. Next, the impact of HSR on tertiary industry aggregation in major cities and high-density cities is greater than that in other cities, whereas the impact on the industrial structure advancement is smaller. Moreover, following the HSR's opening, the aggregation of the transportation, warehousing, and postal sectors has been greatly reduced, with a significant spillover effect on neighboring cities, proving the siphon effect and conduction mechanism of the HSR on industrial structure. There has also been a structural shift in the tertiary industry, from basic to advanced sectors. Finally, it has been confirmed that HSR decreases human resource flow costs and plays a partial intermediate function in the aggregation of tertiary industry, and the advanced industrial structure's intermediary role is clearer. The primary contribution of this paper is the selection of appropriate research objects. 
If all HSR lines in an area are evaluated together, the impacts of newly built stations and of rehabilitated existing stations overlap, and the cross-effects lead to skewed results. After extensive comparison, this study determines suitable control and treatment groups to improve identification accuracy. The second contribution is that this research builds a mathematical economic geography model of the influence of HSR opening on the industrial structure, based on the core-periphery model of Krugman, a Nobel laureate in economics. Samuelson's notion of iceberg transportation costs is introduced to extend the model and to provide a reference for follow-up research. The third contribution is that, unlike previous research, this work takes into account the heterogeneity of HSR's economic impacts across sectors, examines the direct and indirect heterogeneous effects of HSR on various industries in the city, and investigates the "siphon effect" of HSR.
As urbanization and the construction of HSR progress, China should continue to boost investment in HSR development and make active use of resource allocation tools to aid the transformation of the industrial structure. At the same time, it should not pursue tertiary industry growth blindly, in order to prevent the emergence of the siphon effect. This paper has two shortcomings. On the one hand, the data used are at the prefecture level rather than the county level, which may limit precision. On the other hand, this work uses the whole railway line as the research object to prevent cross-influence, which may introduce a self-selection problem; this could be addressed by segmenting the line to form a control group.

Heteroscedasticity Test
Considering the possible influence of heteroscedasticity, the Breusch-Pagan (BP) test was performed on the data. The basic principle of the BP test is to detect heteroscedasticity through an auxiliary regression of the squared residuals on the explanatory variables, as shown in Fig. 7. The results show that the null hypothesis of homoscedasticity cannot be rejected, indicating that there is no obvious heteroscedasticity in the data set, so the analysis can proceed. The regressions for industrial structure advancement (SH) and industrial structure rationalization (SR) also pass the BP test.
Fig. 7 The results of the BP test
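The auxiliary-regression principle of the BP test can be illustrated with a minimal sketch. It is not the code used in this paper: the data are synthetic, the helper names (ols_fit, residuals, breusch_pagan_lm) are hypothetical, and the statistic is the studentized (Koenker) form LM = n * R^2, which is an assumption about the variant. Under the null of homoscedasticity, LM is asymptotically chi-square with degrees of freedom equal to the number of non-constant regressors.

```python
import random


def ols_fit(X, y):
    """OLS coefficients from the normal equations (Gauss-Jordan elimination)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def residuals(X, y):
    """Residuals of the OLS regression of y on X."""
    beta = ols_fit(X, y)
    return [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]


def breusch_pagan_lm(X, y):
    """LM = n * R^2 of the auxiliary regression of squared OLS residuals on X.

    X must contain a constant column.
    """
    n = len(y)
    e2 = [e * e for e in residuals(X, y)]
    mean_e2 = sum(e2) / n
    ss_tot = sum((v - mean_e2) ** 2 for v in e2)
    ss_res = sum(u * u for u in residuals(X, e2))
    return n * (1.0 - ss_res / ss_tot)


# Synthetic illustration (assumed data, not the paper's sample).
random.seed(0)
x = [random.uniform(1, 10) for _ in range(200)]
X = [[1.0, xi] for xi in x]
y_const = [1 + 2 * xi + random.gauss(0, 1) for xi in x]    # constant error variance
y_grow = [1 + 2 * xi + random.gauss(0, xi) for xi in x]    # error s.d. grows with x
print("LM, constant variance:", round(breusch_pagan_lm(X, y_const), 3))
print("LM, variance growing with x:", round(breusch_pagan_lm(X, y_grow), 3))
```

With one non-constant regressor, the 5% critical value of the chi-square distribution with one degree of freedom is 3.841; an LM statistic below it means the null of homoscedasticity is not rejected.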

Multicollinearity Test
The variance inflation factor (VIF) measures the severity of multicollinearity in a multiple linear regression model. It is the ratio of the variance of a regression coefficient estimator to the variance that estimator would have if there were no linear correlation among the independent variables. A common rule of thumb is that the VIF of an independent variable should not exceed 10; otherwise, multicollinearity may become a problem. B.1 shows that the VIF of every variable is below 10, that is, there is no obvious multicollinearity among the variables in this paper.
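The VIF definition can be made concrete with a short sketch, using the standard formula VIF_j = 1 / (1 - R_j^2), where R_j^2 is obtained by regressing variable j on a constant and the remaining variables. The helper names (ols_fit, r_squared, vif) and the toy columns are hypothetical and are not the variables of this paper.

```python
def ols_fit(X, y):
    """OLS coefficients from the normal equations (Gauss-Jordan elimination)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def r_squared(X, y):
    """Coefficient of determination of the OLS regression of y on X."""
    beta = ols_fit(X, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
                 for row, yi in zip(X, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def vif(columns):
    """Return one VIF per column; each column is a list of observations."""
    n = len(columns[0])
    out = []
    for j, target in enumerate(columns):
        others = [c for i, c in enumerate(columns) if i != j]
        # Regress column j on a constant plus all other columns.
        X = [[1.0] + [c[t] for c in others] for t in range(n)]
        r2 = r_squared(X, target)
        out.append(float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2))
    return out


# Toy columns (assumed): a and b are uncorrelated, c is close to a.
a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0]
c = [ai + 0.1 * wi for ai, wi in zip(a, [1, 1, -1, -1, 1, 1, -1, -1])]
for name, value in zip("abc", vif([a, b, c])):
    print(name, "VIF =", round(value, 2), "(above 10)" if value > 10 else "")
```

A VIF of 1 means the variable is uncorrelated with the others; values above the rule-of-thumb threshold of 10 flag a variable that is almost a linear combination of the rest.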

RESET Test
Considering possible omitted higher-order terms and interactions among the variables, a regression specification error test (RESET) was performed. RESET is a general method for testing functional form in multiple regression models. It is a joint significance F-test on the squared, cubed, and possibly higher powers of the fitted values from the original OLS estimate, added to the original regression. The results are shown in Fig. 8. Failing to reject the null hypothesis indicates that the model does not obviously omit interaction terms among the independent variables or higher-order terms of the individual variables, so the subsequent in-depth analysis can proceed.
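The RESET rationale can be sketched as follows, assuming squared and cubed fitted values as the added terms. The helper names (ols_fit, fit_and_ssr, reset_f) and the synthetic data are hypothetical and do not reproduce this paper's estimation. The statistic is F = ((SSR_r - SSR_u) / q) / (SSR_u / (n - k - q)), where SSR_r and SSR_u are the residual sums of squares of the original and augmented regressions, q is the number of added powers, and k is the number of original regressors including the constant.

```python
import random


def ols_fit(X, y):
    """OLS coefficients from the normal equations (Gauss-Jordan elimination)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def fit_and_ssr(X, y):
    """Return the fitted values and the residual sum of squares."""
    beta = ols_fit(X, y)
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return fitted, ssr


def reset_f(X, y, max_power=3):
    """F statistic for the joint significance of fitted^2 .. fitted^max_power.

    X must contain a constant column.
    """
    n, k = len(y), len(X[0])
    fitted, ssr_r = fit_and_ssr(X, y)              # original (restricted) model
    X_aug = [row + [f ** p for p in range(2, max_power + 1)]
             for row, f in zip(X, fitted)]
    _, ssr_u = fit_and_ssr(X_aug, y)               # augmented (unrestricted) model
    q = max_power - 1
    return ((ssr_r - ssr_u) / q) / (ssr_u / (n - k - q))


# Synthetic illustration (assumed data): a linear and a quadratic truth,
# both fitted with a linear model.
random.seed(1)
x = [random.uniform(0, 1) for _ in range(200)]
X = [[1.0, xi] for xi in x]
y_lin = [1 + 2 * xi + random.gauss(0, 0.1) for xi in x]
y_quad = [1 + xi + 3 * xi ** 2 + random.gauss(0, 0.1) for xi in x]
print("RESET F, correctly specified:", round(reset_f(X, y_lin), 3))
print("RESET F, omitted squared term:", round(reset_f(X, y_quad), 3))
```

The statistic is compared with the critical value of the F distribution with (q, n - k - q) degrees of freedom; a small value means the null of correct functional form is not rejected.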

Appendix D: Division standard of industries and test results of spatial correlation
See Tables 13, 14, 15, and 16.