Introduction

In the era of the knowledge economy, the value of scientific knowledge far surpasses that of any previous era. As environmental, health, energy, and population issues become increasingly complex, the information entropy of research objects is growing exponentially. Against this backdrop, large-scale research infrastructures (LSRIs) serve as important instruments for exploring the forefront of science and technology (S&T) and providing social public value. Their role in social and economic development is becoming more prominent (Michalowski, 2014; Beck and Charitos, 2021). LSRIs are regarded as large scientific platforms or systems consisting of clusters of large scientific instruments, facilities, and equipment (Michalowski, 2014; Qiao et al., 2016; D’ippolito and Rüling, 2019). The level of construction and operation of LSRIs represents the strength of a country or region’s core original innovation ability (Marcelli, 2014). Therefore, LSRIs are particularly important for emerging countries that hope to catch up with developed countries in the field of S&T. Despite their significant demonstration and radiation effects, LSRIs have long been a topic of controversy due to their high technological complexity, long development cycles, and huge investment (Jiang et al., 2018; D’ippolito and Rüling, 2019). During the construction and operation stages, LSRIs usually face new and variable challenges involving multiple disciplines. Their high complexity and uncertainty make failure more likely and can cause huge economic losses (Beck and Charitos, 2021). Therefore, it is particularly important to scientifically evaluate the knowledge effect produced by LSRIs.
Existing research has explored the definition (Michalowski, 2014; Qiao et al., 2016; D’ippolito and Rüling, 2019), type (Qiao et al., 2016), and distribution (Marcelli, 2014) of LSRIs, analyzed the scientific effect that LSRIs possess in theory (Michalowski, 2014; Qiao et al., 2016), and investigated specific infrastructures using scientometrics or case study methods (Lozano et al., 2014; Carrazza et al., 2016; Caliari et al., 2020). However, as some studies have pointed out, there are few systematic evaluations of the effect of LSRIs based on a causal inference framework (Bollen et al., 2011), and further efforts are needed to identify the role of LSRIs in innovation growth at the regional level (Caliari et al., 2020; Beck and Charitos, 2021).

In modern scientific research and technology engineering, complex mathematical calculations beyond human cognitive abilities are frequently encountered and must be solved using computers (Bollen et al., 2011; LeDuc et al., 2014). High-performance computing applications integrate modeling, algorithms, software development, and computational simulation. They serve as a necessary link for applying high-performance computers in cutting-edge basic scientific research and have become a third scientific method alongside theoretical research and scientific experiments. This study focuses on the impact of China’s National Supercomputing Center (NSC) on knowledge innovation (KI) and the mechanisms behind it. There are two reasons for choosing the NSC in China as the research object. On the one hand, China has attached great importance to large-scale scientific facilities and their associated research in recent years, particularly in the field of supercomputing. The national and local governments have cumulatively invested billions of dollars and constructed more than ten national-level supercomputing centers. Among them, the Tianhe-1, Tianhe-2, and Sunway TaihuLight supercomputers are representative examples that have consistently ranked among the top ten on the TOP500 list, making China one of the few developing countries to achieve this level in large-scale scientific infrastructure investment. Investigating whether LSRI investment enhances computational capabilities and contributes to scientific productivity or is merely an ambitious “image project” (public image campaign) is of great significance. On the other hand, since 2009, China has established NSCs in some cities (not traditional first-tier cities such as Beijing and Shanghai), which can be viewed as an external shock for local development.
This provides a prerequisite for evaluating the scientific effect of NSCs under a causal inference framework, especially given that the goals and application fields of NSCs are rooted in local innovation endowment and industrial foundation (as a place-based policy). Taking Tianjin (one of the four municipalities directly under the central government of China) as an example, the construction of NSC has promoted the establishment of Tianhe S&T Park and the Industrial Big Data Application Innovation Center, aiming to build an industrial innovation system that integrates industry, academia, and research, promoting local talent cultivation and international cooperation.

Considering the unique attributes of NSC, this study’s research topic and objectives not only aim to address the ongoing debate regarding the role of LSRIs but also encompass the following two aspects:

  1. As an extension of digital infrastructure.

    Influenced by Schumpeterian innovation theory, introducing new technology and utilizing the power of “creative destruction” to enhance production levels are regarded as crucial factors for regional economic growth (Cardona et al., 2013; Batabyal and Nijkamp, 2016). The emergence and widespread use of information and communication technology (ICT) have fostered the digital economy. With the development of 5G communication, big data, and artificial intelligence, digital technology is increasingly viewed as a radical new technology. Existing literature confirms the broader effects of digital infrastructure construction on economic growth, urban innovation, corporate transformation, and social development (Cardona et al., 2013; Balcerzak and Bernard, 2017; Zhou et al., 2021; Zhang et al., 2022; Tang and Zhao, 2023). Most of this literature focuses on the effects of network infrastructure, while relatively little attention has been given to the role of computing infrastructure, specifically its impact on promoting scientific knowledge production. This study seeks to provide evidence of the NSC’s influence on regional knowledge innovation as computing infrastructure.

  2. As a practice of place-based innovation policy.

The scale effect of agglomeration leads to an increase in research and development (R&D) factor demand and releases the self-reinforcing characteristics of innovation, which may hinder the catching up of backward areas with advanced areas and have a negative impact on overall regional competitiveness and inclusiveness. Place-based innovation policy plays a crucial role in promoting coordinated regional innovation with the aim of achieving innovation convergence (Barca et al., 2012; Liu and Li, 2021). Although this policy model follows the principle of differentiation, some scholars are cautious about intervention, believing that government intervention may distort resource allocation, resulting in a loss of innovation efficiency, or point out that the impact of local policies is limited (Neumark and Simpson, 2015; Lu et al., 2022). As Marcelli (2014) mentioned in his study, scientific infrastructure is often situated in specific geographic locations, as evidenced by the establishment of multiple NSCs in different cities. Therefore, this study can also be seen as an evaluation of place-based innovation policy, involving identifying how supercomputing centers contribute to the development of local and regional knowledge innovation.

Compared with existing literature, this study aims to make several theoretical contributions:

First, drawing on the resource-based view, we categorize urban innovation resources into tangible resources such as human, financial, and physical capital, and intangible resources including social capital and resource utilization efficiency. This process not only shifts the focus from enterprise strategic resources to regional innovation resources but also integrates the resource-based view with social network theory and innovation efficiency research. Second, we establish a link between the scientific effect of LSRIs and the resource-based view, mapping the four scientific effect dimensions (S&T advancement effect, capability cultivation effect, networking effect, and clustering effect) to innovation resources (both tangible and intangible). By extending the evaluation of LSRIs to the regional level, we provide empirical evidence for the causal relationship between LSRIs and their innovation performance. Third, we classify the mechanism of NSC’s impact on regional knowledge innovation into three representative effects: the basic effect represented by R&D expenditure, S&T human resources, and digital infrastructure; the network effect represented by urban innovation network centrality; and the technology effect represented by innovation efficiency. By utilizing the convergence model, we verify the policy spillover of LSRIs and elucidate the role of computing infrastructure construction as a place-based innovation policy in regional innovation.

This article first discusses the definition and scientific effect of LSRIs and, based on the resource-based view, constructs a conceptual framework of LSRIs’ impact on knowledge innovation by mapping the scientific effect to different innovation resources. Next, we propose three mechanisms through which NSC affects KI (basic effect, network effect, and technology effect) and briefly review the development history of NSC in China. Then, the data, methods, and estimation results are presented. Finally, we discuss and summarize the research results and suggest policy implications.

Theoretical basis and evaluation framework

Scientific effect of large research infrastructures

LSRIs, which are scientific research facilities built to meet the needs of modern “big science” research, aim to expand human cognitive abilities, discover new laws, and incubate new technologies. Some studies have divided the roles of LSRIs into categories including, but not limited to, scientific, technological, economic, educational, and other social aspects (Marcelli, 2014; Michalowski, 2014; Qiao et al., 2016; Carrazza et al., 2016; Caliari et al., 2020; Beck and Charitos, 2021). The OECD report in 2014 partitioned the impacts of LSRIs into scientific achievements, impacts of construction and operation, personnel training, scientific cooperation, technological innovation, and education (Michalowski, 2014). Qiao et al. (2016) established an analytical framework to evaluate the implementation effects of LSRIs, deconstructing the scientific effect of LSRIs from the perspectives of the S&T advancement effect, capability cultivation effect, networking effect, and clustering effect. Caliari et al. (2020) considered that LSRIs can make significant contributions to the economic growth of developing countries through technology and innovation, with specific roles involving scientific output and technological progress, supporting the development of industrial, health, and agricultural sectors. Some scholars have explored specific impacts of LSRIs in a targeted manner. For example, D’ippolito and Rüling (2019) discussed the types and formation of cooperation and their impact in the context of LSRI sharing. Scarrà and Piccaluga (2022), aiming to understand how big science affects innovation through transfer mechanisms and spillover effects, reviewed the relevant research directions through a literature survey covering six major themes, including technology transfer methods and mechanisms, cooperation with the public sector, and spillover effects of LSRIs.

This article aims to examine the impact of NSC, a type of LSRI, on regional knowledge innovation and its mechanism. Establishing a framework is a prerequisite for conducting the evaluation. Given that the area of the research sample is China, in order to better fit the institutional and developmental background, we build the framework based on Michalowski (2014) and Qiao et al. (2016) and integrate the resource-based view to construct the path of NSC’s impact on knowledge innovation.

Resource-based view, social network theory, and innovation efficiency

The resource-based view emphasizes that an organization’s success is rooted in its specific resources, which constitute the logical starting point for strategic decision-making. The impact of resources on an organization’s competitive advantage applies not only to the enterprise level but also to the competitiveness of regions and countries, which depend on their resource endowments (Porter, 1990; Fatima et al., 2022; Ge and Liu, 2022). The study of the firm has identified specific forms of resources, with Grant (1991) proposing six major resources, including financial resources, physical resources, human resources, technological resources, reputation resources, and organizational resources. Das and Teng (1998) divided resources into financial, managerial, material, and technological categories. These heterogeneous resources can be classified into different groups based on different criteria, such as tangibility or whether they are protected by property law. In the context of innovation, Del Canto and Gonzalez (1999) categorized R&D resources into three types: financial, physical (capital intensity), and human resources. Auranen and Nieminen (2010) argued that organizations ensure the continuous development of R&D activities by acquiring and possessing equipment, funding, and personnel. Of course, the development of urban knowledge innovation not only depends on the direct input of local R&D resources but also the interaction with other regions. Social network theory asserts that the manner in which events unfold is contingent upon the context in which they take place. From the perspective of social capital, networks have significant value in transmitting resources (Beck and Charitos, 2021; Wei et al., 2022). Thus, relationships established through interaction and the networks formed through accumulation become important channels for information and knowledge diffusion. 
Similar views also appear in studies of the knowledge-based view (KBV), where the practicality of knowledge determines the need for interaction with external groups, and regions can effectively supplement their local knowledge resources through external knowledge spillover channels (Das and Teng, 1998; Ge and Liu, 2022). Overall, the resource-based view (RBV) regards the creation and maintenance of networks as a mechanism for acquiring scarce resources, and the degree of embedding of regions in knowledge innovation networks reflects their implicit social capital resources.

However, in many instances, an organization’s success is not determined by its possession of superior resources, but rather by its ability to effectively utilize them. Simply possessing specific resources does not guarantee an organization’s competitive advantage, which makes resource utilization a critical issue in resource-based theory research (Majumdar, 1998; Arbelo et al., 2021). Given the high uncertainty of R&D activities and the limited quantity of R&D resources, it is insufficient to explore resource input alone. This issue is relevant to the use of resources at the micro level, such as in businesses, universities, and laboratories, as well as at the macro level, including in cities, regions, and even countries. It involves how to optimize resource input efficiency, such as using fewer resources to support the same level of business (output) or using existing resources to support more business (output). Relevant literature in the field of innovation suggests that knowledge production efficiency or innovation efficiency can be understood as the level of innovation potential formed by different R&D resources, i.e., the degree to which innovation input is converted into actual innovation output (Bai et al., 2020). The efficiency level is often related to the institutional background, organizational model, and internal structure of the research subject (Li, 2009). At the city level, innovation efficiency is influenced by the internal innovation organization and element structure, as well as the innovation environment.

Based on the above discussion, we consider that the RBV provides a conceptual framework for examining the impact of NSC on knowledge innovation. This view aligns with several dimensions of the scientific effect of LSRIs. At the regional level, innovation factors, including R&D funding and human and material capital, are regarded as the basic components of inter-regional innovation capacity, or as the core inputs for knowledge production. In this study, we treat these factors as tangible innovation resources unique to the local area. They can be mapped to the capability cultivation effect (talent cultivation) and clustering effect (innovation agglomeration) in the scientific effect. Furthermore, considering that the cross-regional networks formed by the interaction of cities with other regions constitute one of the main channels for information exchange and knowledge spillover, we view the embedding of cities in the innovation network as one of the main ways to obtain external resources (the networking effect in the scientific effect of LSRIs), and thus as an intangible social capital resource of cities. Finally, given that the utilization of resources (affected by technological progress, institutional factors, and agglomeration, mapping to the capability cultivation effect and clustering effect) is as important as resource acquisition in increasing regional competitive advantages, we consider it another intangible resource besides social capital at the regional level. In this way, we integrate the resource-based view, social network theory, and innovation efficiency, forming a conceptual framework in which NSC influences knowledge innovation by changing regional innovation resources, which is linked to existing dimensions of the scientific effect (Fig. 1).

Fig. 1: Mapping association and conceptual framework.

The mapping of resource possession and utilization, along with the various dimensions of scientific effects within LSRIs, has achieved the integration of the resource-based view, social network theory, and innovation efficiency. The conceptual framework, wherein NSC influences KI, is thereby constructed.

Research hypothesis

Basic effect

Similar to other infrastructures, the construction of NSC also has a knowledge spillover effect (across technological fields). NSC not only provides high-performance computing services but also has a complete application software environment. With accumulated research achievements and industry big data, the center can achieve the integration of supercomputing, big data, and artificial intelligence through R&D, construct a supercomputing application network, provide resources and platforms for digital services, and foster emerging industries in the supercomputing field. The NSC in Tianjin includes the supercomputing center, cloud computing center, e-government center, big data, and artificial intelligence R&D environment, aiming to promote the rapid development of the digital industry in the local and surrounding areas. It should be emphasized that the digital infrastructure requires financial support from the government, which often provides funding for R&D activities conducted by both public and private institutions (Gao and Yuan, 2020). The social benefits of R&D activities cannot be fully internalized by market mechanisms, making the government’s fiscal intervention somewhat reasonable. The construction of LSRIs represented by NSC requires a large investment and involves high risks, and there is a lack of sufficient motivation for private capital involvement. Given that the establishment of NSC relies heavily on fiscal investment as a key driver, the increase in government fiscal expenditures is needed to provide support for R&D and operation. Furthermore, the government may raise its S&T expenditures by increasing the number of project applications, which could provide financial support for universities, research institutions, and enterprises to purchase computing power, thereby facilitating efficient scientific research. Finally, talent is the key to influencing a city’s innovation and learning abilities. 
On the one hand, the construction and operation of the NSC require professionals in high-performance computing, computer networks, parallel software, and distributed systems, who should possess relevant industry experience and professional knowledge. On the other hand, investment in the new digital technologies and knowledge spillover from the NSC will drive the development of digital and other emerging industries. These high value-added and knowledge-intensive industries will in turn attract more S&T talents and enhance the regional innovation competitiveness. Based on the above, this study hypothesizes:

H1. NSC can promote the development of knowledge innovation by influencing regional financial resources (fiscal S&T expenditure), human resources (S&T talents), and material resources (digital infrastructure). We also define this as the basic effect of NSC on KI.

Network effect

Digital infrastructure characterized by informatization and networking can overcome the barriers of temporal and spatial distance in scientific research activities. It not only connects innovative fields that were previously isolated, promoting knowledge convergence and recombination, but also facilitates long-distance knowledge dissemination that would otherwise be constrained by geographic limitations (Qiao et al., 2016; da Silva Neto and Chiarini, 2023). To acquire high-end digital technology services, other cities are often more willing to establish cooperative relationships with NSC cities. Such cooperation can not only strengthen existing collaborative interactions but also potentially form new cooperative relationships. Deeper embedding in the innovation network structure can increase the centrality of the city in the network, which contributes to the attainment of more information and resource benefits (Han et al., 2021; Wen et al., 2021). Cross-regional cooperation connects innovation organizations with heterogeneous knowledge, which can alleviate the innovation reduction caused by homogeneous knowledge at the local level (Hazır et al., 2018). However, existing research has also revealed the negative impacts of excessive centrality. The establishment and maintenance of social relationships incur a certain cost, and excessive embedding in network structures may lead to increased maintenance costs, potentially crowding out innovation resources and generating diseconomies of scale (Wang et al., 2014). Complex connections imply exposure to more information, which poses challenges to information screening and integration and can even lead to information overload. NSC relies on a supercomputer consisting of thousands of processors and extends the development of new digital technologies such as big data, artificial intelligence, and cloud computing, enabling the storage and recognition of massive amounts of information and knowledge.
This reduces the cost of network maintenance, and cities can improve efficiency in the processes of capturing external information, absorbing knowledge, and maintaining external relationships, thereby enhancing the positive impact of urban research network embedding and encouraging cities to participate actively in scientific research cooperation networks. Empirical evidence shows that the NSC in Chengdu has offered computing services to over 760 users across 35 cities, including major metropolitan areas such as Beijing, Shanghai, Guangzhou, and Chongqing. The NSC in Tianjin provides computing services to over 30 provinces, municipalities, and autonomous regions across the country, with more than ten partner institutions, including universities such as Peking University, Dalian University of Technology, Jilin University, and Harbin Engineering University, as well as local governments such as Linyi City. Additionally, it has established joint laboratories with 17 institutions to support basic research and technological innovation. Based on the above analysis, it is reasonable to propose the following hypothesis:

H2. The construction and operation of NSC can promote the development of knowledge innovation by affecting the embedding of the region in the national scientific research network (i.e., social capital resources), which is also defined in this study as the network effect of NSC’s impact on KI.
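The notion of a city's centrality in a research network, which H2 relies on, can be made concrete with a small sketch. The code below is illustrative only: it assumes an undirected inter-city co-publication network and uses normalized degree centrality, which is one common measure and not necessarily the one used in this study. The function name `degree_centrality` and the city labels are hypothetical.

```python
from collections import defaultdict


def degree_centrality(coauthor_links):
    """Normalized degree centrality for cities in an undirected network.

    coauthor_links: iterable of (city_a, city_b) pairs, one pair per
    inter-city collaboration (assumed input format, for illustration).
    """
    neighbors = defaultdict(set)
    for a, b in coauthor_links:
        if a == b:
            continue  # within-city collaboration does not create a link
        neighbors[a].add(b)
        neighbors[b].add(a)

    n = len(neighbors)
    if n < 2:
        return {city: 0.0 for city in neighbors}
    # Share of the other (n - 1) cities that each city collaborates with
    return {city: len(adj) / (n - 1) for city, adj in neighbors.items()}


# Hypothetical example: city "A" collaborates with every other city
links = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
centrality = degree_centrality(links)
```

Under this measure, a city that hosts an NSC and gains new partner cities would see its centrality rise, which is the channel H2 describes.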

Technology effect

The impact of NSC on the utilization capacity of regional S&T resources is mainly reflected in two aspects. (1) Computing efficiency. The emergence of new data features has brought about governance challenges, such as how to handle, store, transmit, and analyze data, while also driving a paradigm shift in scientific research. In fields such as drug testing, genomics research, climate simulation, energy exploration, molecular modeling, and astrophysics, high-dimensional and massive data impose higher demands on computing power and memory. Supercomputers have two main features: fast data processing speed and large data storage capacity. For the former, the computing speed of supercomputers in China can currently exceed hundreds of billions of calculations per second, which is millions of times faster than ordinary computers. The peak computing speed of the “Tianhe-2” system is 100.7 PFlops, and the sustained computing speed is 61.4 PFlops; for the Sunway TaihuLight supercomputer, these two figures are 125.4 PFlops and 93.1 PFlops, respectively. As for the latter, the total storage capacity has reached dozens of petabytes. The supercomputing center is equipped with various peripheral devices and highly functional software systems, which greatly shorten the innovation cycle and reduce the cost and uncertainty of innovation. (2) Allocation efficiency. The limited ability of the public sector to acquire information and knowledge, along with the potential for policy lag, may result in interventions that fail to produce the expected effects. In the context of NSC construction, supporting the entire process of data collection, sharing, computation, analysis, and application with computing power can significantly improve the level of intelligence, precision, and scientific decision-making in social governance. Governments can use digital technology tools to better plan and formulate S&T policies.
Through multi-disciplinary and cross-departmental information sharing, supercomputing centers can leverage new technologies such as cloud computing, big data, and the Internet of Things to optimize the integration and allocation of scientific and technological resources and achieve scientific decision-making for S&T expenditures. By helping to build a digital platform for urban S&T resources, the correlation mapping of different data such as projects, expenditures, and results can be achieved, providing support for project progress, result evaluation, and further funding decisions. NSC and its supporting digital infrastructure construction can significantly improve the efficiency of data collection, processing, transmission, storage, and other aspects on both the supply and demand sides. The cloud platform is configured with multiple databases, such as those of scientific research results and talents, achieving “dual-linkage” between scientific research results and innovation needs. This model mitigates the temporal and spatial limitations of the interaction between supply and demand for R&D elements and reduces search costs, effectively improving the city’s ability to integrate S&T resources. Based on the above analysis, the following hypothesis is put forward:

H3. The construction and operation of NSC can promote the development of knowledge innovation through the improvement of computing and allocation efficiency. We also define this as the technology effect of NSC affecting KI.

Innovation convergence and knowledge diffusion

The importance of knowledge in modern economic development is increasing, and regional innovation is becoming increasingly reliant on close spatial associations. In other words, the growth of innovation in a region depends not only on the local accumulation of knowledge but also on the diffusion of knowledge from neighboring areas, which is one of the potential opportunities for promoting regional innovation development (Tang and Cui, 2023). Existing literature indicates that the construction of digital infrastructure can help spread and diffuse knowledge (Batabyal and Nijkamp, 2016; Balcerzak and Bernard, 2017), while S&T centers can promote regional collaborative development through knowledge spillovers generated by innovation clusters (Gao and Yuan, 2020). Therefore, the construction of NSC, which combines digital infrastructure and scientific infrastructure, may also have an impact on the knowledge innovation development of neighboring areas. Firstly, this type of proximity relationship can be general geographic adjacency, as the flow of R&D factors and the effects of knowledge learning (especially tacit knowledge) are still influenced by geographical distance. For example, the NSC in Kunshan was established to undertake advanced computing and scientific big data processing business in the Yangtze River Delta region and to engage in strategic cooperation with the Suzhou Deep-time Digital Earth Research Center, the Shanghai Neuroscience Research Center, and other institutions, carrying out applied computing research and services in scientific fields including artificial intelligence and biomedicine. Secondly, proximity can also refer to economic distance, which in the context of this study can be understood as differences in the level of urban digitization.
Research at the regional level applying the theory of innovation absorption indicates that a region requires relevant prior knowledge and a compatible cognitive structure to accurately identify, integrate, and effectively absorb valuable knowledge (Ge and Liu, 2022). If the gap in digitalization levels between regions is too large, latecomer regions lacking sufficient digital technology and a complete information communication environment are unlikely to benefit from the computational power services provided by NSC. Finally, the cross-regional cooperative network formed by interaction among knowledge production organizations is regarded as another channel for knowledge diffusion. Advanced regions can obtain exogenous knowledge that is different from local knowledge through cooperation. If knowledge, experience, and resources from advanced regions can diffuse to less advanced regions, innovators in the latter can use the absorbed knowledge to create new scientific outputs, bridging the knowledge gap with the scientific forefront (De Noni et al., 2018; Erdil et al., 2022). The establishment of NSC could create policy spillover through the provision of computational power services to previously closely connected partners, thereby promoting local knowledge innovation and development. Based on the above analysis, we propose the following hypotheses:

H4a. The establishment of the NSC may facilitate policy spillover through proximity relations.

H4b. Building on existing policy spillover, the NSC fosters regional knowledge innovation convergence.
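To make the idea of convergence in H4b concrete, one widely used formalization is the β-convergence regression. The specification below is an illustrative assumption and not necessarily the convergence model estimated in this study; the symbols $\alpha$, $\beta$, $\mu_i$, $\lambda_t$, and $\varepsilon_{i,t}$ are introduced here only for exposition.

```latex
\ln\!\left(\frac{KI_{i,t+1}}{KI_{i,t}}\right)
  = \alpha + \beta \,\ln KI_{i,t} + \mu_i + \lambda_t + \varepsilon_{i,t}
```

Here $KI_{i,t}$ is the knowledge innovation level of city $i$ in year $t$, while $\mu_i$ and $\lambda_t$ denote city and year fixed effects. A negative $\beta$ means that cities with lower initial KI grow faster, that is, regional knowledge innovation converges. Under this reading, H4b would correspond to a faster rate of convergence among cities exposed to NSC spillovers.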

Development process and institutional background of NSCs

Main supercomputers in China

(1) Tianhe series: In 1978, the Chinese government set the goal of creating a supercomputer and assigned the task to the National University of Defense Technology. The first computer in China able to perform more than 100 million calculations per second, named “Yinhe-I”, was successfully appraised in Changsha in 1983, and subsequent models were released in succession. Building on this experience and technology, the university developed the “Tianhe-1” and “Tianhe-2”, which reached the top of the TOP500 list in 2010, 2013, and 2015 (see serial numbers 1, 3, and 4 in Table 1 for details). (2) Dawning series: The “Dawning” series of supercomputers was developed by the Institute of Computer Science at the Chinese Academy of Sciences. In 1993, the “Dawning-I” was successfully developed, marking a breakthrough for China in the field of Symmetric Multi-Processing. In 2010, the “Dawning Nebula” supercomputer achieved second place on the world’s supercomputer list (TOP500), representing the most significant accomplishment of the “Dawning” series (see number 2 in Table 1 for details). (3) Sunway series: In 1996, the National Research Center for Parallel Computer Engineering and Technology was established, signaling the beginning of the development of the “Sunway” supercomputers. In 2010, the “Sunway Blue Light” was created and subsequently installed at the NSC in Jinan. In 2016, the “Sunway TaihuLight” supercomputer achieved first place on the TOP500 list (see number 6 in Table 1 for details). For a comprehensive account of the development of high-performance computing in China, see Wang (2023).

Table 1 Distribution and overview of NSC in China.

Supercomputing center in China

The NSCs were established by the Ministry of Science and Technology to provide high-performance computational resources for scientific research in China. Since 2009, NSCs have been approved and constructed in different cities, including Tianjin, Shenzhen, Changsha, Guangzhou, Jinan, Wuxi, and Zhengzhou. The central government cooperates with local governments to jointly fund the construction of NSCs. During the operation phase, some operating expenses are subsidized by local government finances. At the same time, central and local governments open project applications to researchers, who use part of their research funds to purchase computing resources. Currently, the NSCs, along with their supporting data storage and backup centers, are applied in fields such as biomedicine, genetics, aerospace, climate, marine science, artificial intelligence, new materials, new energy, neuroscience, and smart cities. Taking the NSC in Tianjin as an example, the “Tianhe-1” supercomputer has supported more than 2000 national science and technology major projects and national key R&D programs during its operation, with total funding exceeding 2 billion yuan. It has received both national and provincial-level awards and has contributed to thousands of published academic achievements (see https://www.nscc-tj.cn/index).

Data and methodology

Data and variables

Dependent variables

This paper aims to explore the impact of NSC on regional knowledge innovation (KI). Broadly speaking, KI involves multiple aspects, including knowledge acquisition, creation, and transformation, among which knowledge creation is the core. Innovation organizations such as universities and research institutes engage in basic and applied research to pursue new scientific discoveries and generate new scientific knowledge. Scientific publications serve as the primary carrier of scientific knowledge, reflecting the latest advances in scientific research, and are also an important channel for knowledge diffusion across research fields and geographic locations (Qiao et al., 2016). The output of scientific papers in a city to some extent reflects the level of knowledge innovation in the region (Li, 2009; Yang et al., 2022a). Meanwhile, the publications database provides public access to the city with which each researcher is affiliated. Therefore, this study characterizes the level of knowledge innovation by the per capita number of scientific publications in a city, measured specifically by the number of papers included in the Science Citation Index (SCI). In the robustness test, highly cited SCI papers are defined as follows: SCI papers in the same discipline and year are sorted by citation count, and those ranked in the top 1% are considered highly cited.

Independent variable

The variable NSC is a dummy variable that takes a value of 1 if the city approves the construction of an NSC in the current year or any year thereafter, and 0 otherwise.
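The staggered treatment indicator described above can be sketched as follows. This is a minimal illustration, not the authors' code: the city names and approval years are hypothetical placeholders, and the function name `nsc_dummy` is an assumption introduced here.

```python
def nsc_dummy(city, year, approval_year):
    """Return 1 if `city` had an NSC approved in `year` or earlier, else 0."""
    approved = approval_year.get(city)
    return int(approved is not None and year >= approved)


# Hypothetical approval years, for illustration only (not taken from the paper).
approval_year = {"CityA": 2009, "CityB": 2012}

# Build a small city-year panel: (city, year, NSC dummy).
panel = [
    (city, year, nsc_dummy(city, year, approval_year))
    for city in ("CityA", "CityB", "CityC")
    for year in range(2008, 2014)
]
```

Because the dummy switches on in the approval year and stays on afterwards, cities approved in different years enter the treatment group at different times, which is what makes the DID design time-varying.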

Mechanism variables

(1) The basic effect variables. Firstly, considering the significant role of financial investment in the construction and operation stages of LSRIs, government financial support for S&T (R&D_exp, billion yuan) is selected as the proxy variable for R&D expenditure (Li, 2009; Liu and Li, 2021; Ge and Liu, 2022). Secondly, scientific and technological talent (human resources; Li, 2009; Gao and Yuan, 2020) is represented by the number of employees engaged in scientific research and technology services in the city (R&D_talent, 10,000 persons); the construction, operation, and radiation effects of the NSC require a sufficient number of knowledge-based personnel, particularly in STEM fields. Thirdly, digital infrastructure construction (physical resources) covers areas such as 5G, artificial intelligence, and industrial internet, reflecting the development of technological and material resources in the context of NSC (Zhang et al., 2022; Tang and Zhao, 2023). It is indirectly characterized by the city’s digitalization index, constructed using the entropy method from secondary indicators including the number of mobile phone users and internet users, the revenue of the postal and telecommunications industry, and the number of relevant employees (in the information transmission, computer services, and software industry). (2) The network effect variables. The network effect variables relate to the centrality and structural holes of actors in the network. Generally, an actor’s position in a network is considered more important if the actor has higher centrality and spans more structural holes. This study focuses on whether the construction of NSC affects the direct connections between NSC cities and other regions, as well as their embedding in the network structure, rather than on the control of knowledge flow between nodes. Therefore, we select indicators that reflect urban centrality, including degree centrality and closeness centrality (Wang et al., 2014; Han et al., 2021). (3) The technology effect variable. To measure the technology effect of NSC in terms of innovation efficiency (Innova_effi), we employ the stochastic frontier analysis (SFA) method, which is rooted in economic theory and allows for a more rigorous measurement of innovation efficiency (Li, 2009).
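The entropy method used for the digitalization index can be sketched as below. This is a common textbook variant and only an assumption about the exact procedure: indicators are min-max normalised and treated as "larger is better", and the four-indicator sample values are invented for illustration.

```python
import math


def entropy_weights(rows):
    """Entropy-method weights and composite scores.

    rows: list of observations (cities), each a list of indicator values.
    Requires at least two observations.
    """
    n = len(rows)
    # Min-max normalise every indicator (column) to [0, 1].
    norm = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = hi - lo
        norm.append([(v - lo) / span if span else 0.0 for v in col])
    # Divergence (1 - normalised entropy) of each indicator.
    divergence = []
    for col in norm:
        total = sum(col)
        if total == 0:  # constant indicator carries no information
            divergence.append(0.0)
            continue
        entropy = -sum((v / total) * math.log(v / total) for v in col if v > 0)
        divergence.append(1.0 - entropy / math.log(n))
    d_sum = sum(divergence)
    if d_sum == 0:
        raise ValueError("no indicator carries information")
    weights = [d / d_sum for d in divergence]
    scores = [sum(w * col[i] for w, col in zip(weights, norm)) for i in range(n)]
    return weights, scores


# Hypothetical values for four cities and four secondary indicators.
indicators = [
    [120.0, 30.0, 5.0, 1.0],
    [300.0, 80.0, 12.0, 2.5],
    [450.0, 150.0, 20.0, 4.0],
    [900.0, 400.0, 60.0, 9.0],
]
weights, scores = entropy_weights(indicators)
```

Indicators whose values are spread more unevenly across cities have lower entropy and therefore receive larger weights, so the index is driven by the indicators that discriminate most between cities.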

Control variables

(1) Economic development (Liu and Li, 2021; Lu et al., 2022; Gao and Yuan, 2020). Characterized by per capita GDP (Econ, 10,000 yuan). (2) Industrial structure (Tang and Cui, 2023; Lu et al., 2022): Measured by the share of the secondary industry in the GDP (Industry_sec, %). (3) Comprehensive growth rate (n + g + δ): Modeled according to the research of Yang (2021), and calculated as the sum of natural growth rate, technological progress rate, and capital depreciation rate, assuming that the sum of technological progress rate and capital depreciation rate equals 5%. (4) Financial sector development: Gauged by the aggregate amount of loans and deposits held by financial institutions (Finan, 10,000 yuan). (5) Human capital (Tang and Cui, 2023; Liu and Li, 2021; Gao and Yuan, 2020): Proxied by the number of university students per 10,000 individuals (H_cap). (6) Traffic and Openness (Yang et al., 2021; Gao and Yuan, 2020): Indicated by the total volume of passenger transport via road, water, and air transport for openness and commuting (Trans, 10,000 people).

Data sources and processing

The research sample in this study consists of panel data for 283 Chinese cities from 2000 to 2020. The number of papers indexed by the Science Citation Index (SCI) for each city is obtained from the Web of Science (WoS) database (https://www.webofscience.com). The centrality measure is based on constructing a city-level scientific research network matrix. To construct the matrix, information on scientific research collaborations among cities in China is obtained through Python. If authors from different cities appear in the same paper, it is considered as a collaboration between the cities, and the centrality of cities is then calculated using Ucinet software. The control and mechanism variables, including government financial support for S&T, the number of employees in scientific research and technology services, the number of mobile phone users and internet users, revenue of postal and telecommunications industry, number of relevant employees (in information transmission, computer services, and software industry), are obtained from the “China Urban Statistical Yearbook”.
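The construction of the city collaboration network can be sketched as follows. The paper computes centrality in Ucinet; the pure-Python version below is only an illustrative stand-in that uses the standard unweighted definition of normalised degree centrality, and the paper records are toy data with placeholder city labels.

```python
from collections import defaultdict
from itertools import combinations


def collaboration_links(papers):
    """Count co-authored papers for each unordered pair of cities.

    papers: iterable of lists of author cities, one list per paper.
    """
    links = defaultdict(int)
    for cities in papers:
        # A city pair is counted once per paper, however many authors it has.
        for a, b in combinations(sorted(set(cities)), 2):
            links[(a, b)] += 1
    return dict(links)


def degree_centrality(links):
    """Share of other cities with which each city has at least one joint paper."""
    partners = defaultdict(set)
    for a, b in links:
        partners[a].add(b)
        partners[b].add(a)
    n = len(partners)
    if n < 2:
        return {}
    return {city: len(p) / (n - 1) for city, p in partners.items()}


# Toy records: each inner list holds the cities of one paper's authors.
papers = [["A", "B"], ["A", "C", "B"], ["C", "C"], ["A", "D"]]
links = collaboration_links(papers)
centrality = degree_centrality(links)
```

In this sketch a paper written entirely within one city creates no link, and cities without any inter-city paper are left out of the network; whether the original analysis treats such cases the same way is not stated in the text.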

Missing values are handled by the interpolation method or replaced with the mean of the city. As for the measurement of Innova_effi, we use government financial support for S&T, and the number of employees in scientific research and technology services as input indicators for financial and human resources. The output indicator is represented by knowledge innovation. Two models, the Cobb-Douglas production function model and the stochastic frontier analysis with translog function, are respectively used for calculation, and the generalized likelihood ratio test is used for verification. The results (LR chi2 = 77.76, P = 0.000) indicate that the stochastic frontier analysis with translog function is more suitable for calculating innovation efficiency. Descriptive statistics of the variables are presented in Table 2, and natural logarithms are taken for some variables (including Econ, Finan, H_cap, Trans) with values affected by price factors. As per Table 2, it is discernible that the standard deviations of a series of variables, including KI, are notably higher than the means. This indicates significant data dispersion, unveiling substantial inter-city disparities, a fact also elucidated by the distinctions between the maximum and minimum values. To scrutinize the influence of skewed data on the estimation results, in the section “Robustness test”, we conduct a robustness examination by substituting models. Additionally, recognizing the disparities between the treatment and control groups (descriptive statistics of the groups are retained), we acknowledge potential disruptions from bidirectional causality and sample self-selection biases in the baseline regression results. To address this issue, the study employs methods including IV, 2SLS, PSM-DID, and placebo tests.

Table 2 Descriptive statistics.

Methodology

Since 2009, the Ministry of Science and Technology has successively approved the establishment of NSCs in several cities. We take this as a quasi-natural experiment and regard cities with supercomputing centers as the treatment group and other cities as the control group to examine the effect of NSC on regional knowledge innovation. Due to differences in the construction time of supercomputing centers, the study first constructs a time-varying difference-in-differences (DID) model:

$${{KI}}_{{it}}={v}_{i}+{\mu }_{t}{+\beta }_{1}{{NSC}}_{{it}}+\gamma {Z}_{{it}}^{{\prime} }+{\varepsilon }_{{it}}$$
(1)

In the above equation, i and t denote cities and years, respectively. \({{KI}}_{{it}}\) is the knowledge innovation level of city i in year t. \({Z}_{{it}}^{{\prime} }\) represents other control variables that may affect the level of urban knowledge innovation. \({v}_{i}\) and \({\mu }_{t}\) represent individual-fixed effects that do not vary over time and time-fixed effects that do not vary across individuals, respectively. Theoretical analysis indicates that the establishment of NSCs could not only promote local knowledge innovation but may also affect neighboring areas. Thus, this study uses the spatial DID method to relax the classical DID assumption that individuals are independent. The most common spatial econometric models are the Spatial Lag Model (SAR), Spatial Error Model (SEM), and Spatial Durbin Model (SDM). Based on the spatial autocorrelation of the dependent variable (Moran’s I is significant at the 1% level in each year, as shown in Table 3), this study follows the selection criteria proposed by Elhorst (2014) and conducts LM and Wald tests on the sample. Moreover, Hausman and LR tests are conducted to assess the appropriateness of a two-way fixed effects model. Ultimately, the benchmark regression model adopts the SDM, as presented below:

$$\begin{array}{ll}{{KI}}_{{it}}={v}_{i}+{\mu }_{t}{+\rho W{{KI}}_{{it}}+\beta }_{1}{{NSC}}_{{it}}+{{\beta }_{2}{WNSC}}_{{it}}\\\qquad\quad+\,\gamma {Z}_{{it}}^{{\prime} }+{\beta }_{3}W{Z}_{{it}}^{{\prime} }+{\varepsilon }_{{it}}\end{array}$$
(2)
Table 3 The Moran’s I of KI (knowledge innovation).

In formula (2), ρ represents the spatial autoregressive coefficient, β2 and β3 indicate the impact of NSC construction and control variables in spatially related areas on knowledge innovation in the focal area, respectively. W is the spatial weight matrix. The inverse distance matrix between cities is mainly used as the spatial weight matrix in this study. The matrix is calculated by using the longitude and latitude data of each city (obtained from Baidu Map API) to calculate the spherical distance between two cities.
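The inverse-distance weight matrix and the Moran's I statistic it feeds into can be sketched as follows. This is an illustration under stated assumptions: distances use the haversine formula with a mean Earth radius of 6371 km, rows of W are normalised to sum to one (the text does not say whether the authors row-normalise), and the coordinates and values are invented.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption of this sketch)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def inverse_distance_weights(coords, row_normalise=True):
    """W[i][j] = 1 / d_ij for i != j, with a zero diagonal."""
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = 1.0 / haversine_km(*coords[i], *coords[j])
        if row_normalise:
            row_sum = sum(w[i])
            w[i] = [v / row_sum for v in w[i]]
    return w


def morans_i(values, w):
    """Global Moran's I of `values` under spatial weights `w`."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in w)
    num = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


# Hypothetical (lat, lon) pairs: two nearby pairs of cities far from each other.
coords = [(0.0, 0.0), (0.0, 1.0), (0.0, 10.0), (0.0, 11.0)]
w = inverse_distance_weights(coords)
ki = [1.0, 1.0, 5.0, 5.0]  # similar values cluster among neighbours
moran = morans_i(ki, w)
```

In this toy layout, nearby cities share similar KI values, so Moran's I comes out positive, which is the pattern of spatial autocorrelation that motivates a spatial model.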

To investigate the mechanism through which NSC affects knowledge innovation, this study employs Alesina and Zhuravskaya’s (2011) mechanism test method. Using a linear model, it first estimates the impact of NSC on the mechanism variables. It then compares the estimated coefficients of NSC in equations that control for the mechanism variables with those that do not, in order to validate the existence of such mechanisms (Gao and Yuan, 2020; Zhang and Wang, 2022; Chen et al., 2023a).

Assuming that the coefficient of NSC in Eq. (2) is significant, the mechanism variable is used as the dependent variable, and the treatment variable (NSC) is used as the independent variable for regression analysis. The specific formula is as follows:

$${{MEDIATING}}_{{it}}={v}_{i}+{\mu }_{t}+{\beta }_{1}{{NSC}}_{{it}}+\gamma {Z}_{{it}}^{{\prime\prime} }+{\beta }_{4}W{Z}_{{it}}^{{\prime\prime} }+{\varepsilon }_{{it}}$$
(3)

If the coefficient of the NSC in Eq. (3) is significant, then NSC and each mechanism variable will be included in the regression model with knowledge innovation as the dependent variable, as shown in Eq. (4). If the estimated coefficient of NSC decreases or is not significant, it means that the construction of NSC can affect the development of urban knowledge innovation through the mediating variable path.

$${{KI}}_{{it}}={v}_{i}+{\mu }_{t}+{\beta }_{1}{{NSC}}_{{it}}+{{MEDIATING}}_{{it}}+\gamma {Z}_{{it}}^{{\prime\prime} }+{\beta }_{4}W{Z}_{{it}}^{{\prime\prime} }+{\varepsilon }_{{it}}$$
(4)
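The three-step comparison in Eqs. (2) to (4) can be illustrated with a deliberately simplified sketch. It uses plain OLS with an intercept and omits the fixed effects, spatial terms, and controls of the actual models; the data are synthetic and constructed so that the mediator carries most of the effect. The helper `ols` is introduced here for illustration only.

```python
def ols(y, columns):
    """OLS coefficients [intercept, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented system [X'X | X'y].
    a = [
        [sum(r[p] * r[q] for r in rows) for q in range(k)]
        + [sum(r[p] * yi for r, yi in zip(rows, y))]
        for p in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot_row = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot_row] = a[pivot_row], a[c]
        pivot = a[c][c]
        a[c] = [v / pivot for v in a[c]]
        for r in range(k):
            if r != c:
                factor = a[r][c]
                a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[c])]
    return [a[r][k] for r in range(k)]


# Synthetic data: the mediator responds to NSC, and KI depends on both.
nsc = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
shock = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
mediator = [2.0 * d + e for d, e in zip(nsc, shock)]
ki = [1.0 * d + 3.0 * m for d, m in zip(nsc, mediator)]

step1 = ols(ki, [nsc])[1]            # effect of NSC without the mediator
step2 = ols(mediator, [nsc])[1]      # effect of NSC on the mediator
step3 = ols(ki, [nsc, mediator])[1]  # effect of NSC once the mediator is included
```

A drop in the NSC coefficient from step 1 to step 3, together with a significant NSC coefficient in step 2, is the pattern the text reads as evidence that the mechanism variable transmits part of the effect.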

In addition to studying the diffusion of knowledge across geographic proximity, this study constructs two further types of adjacency matrices: one based on digitization distance (constructed from the reciprocal of the absolute difference between cities’ digital infrastructure indices) and the other based on collaborative frequency (constructed from the number of collaborations between each city and other cities). Furthermore, following the method of Sala-i-Martin (1996), this paper examines the impact of NSC on knowledge innovation convergence. A detailed description of the process is provided in the section “Methodology”. Based on the conceptual framework, mechanism analysis, and research design presented earlier, the final research framework of this study is illustrated in Fig. 2 below:

Fig. 2: Research framework diagram.

NSC influences KI through basic effect, technology effect, and network effect, with the potential to shape regional knowledge innovation convergence through various proximities.

Results

Benchmarking

Table 4 reports the estimated results of NSC’s impact on knowledge innovation under the geographic proximity matrix. Columns (1) and (2) present the OLS regression results controlling for time- and city-fixed effects. The estimated coefficient of the treatment effect variable NSC is significantly positive at the 5% level or better (10.191/8.958), regardless of whether control variables are included. Columns (3) and (4) show the results of the spatial econometric model, with both the LM test and Wald test statistics significant at the 1% level, supporting the use of the SDM. Specifically, the estimated coefficients of NSC are significantly positive at the 1% level (10.184/8.934), suggesting that the establishment of NSC promotes local knowledge innovation. The estimated coefficients of the spatially lagged term NSC×w are also significantly positive at the 1% level (47.319/37.966), indicating that NSC construction has a positive impact on knowledge innovation in surrounding areas. This can be attributed to the continuous improvement of transportation infrastructure and digital networks (Yang et al., 2021; Zhang et al., 2022), as well as the regional development strategy represented by urban agglomerations (Tang and Cui, 2023). The estimated results in Table 4 provide empirical evidence on the impact of the supercomputing center on regional scientific knowledge production and suggest that neglecting policy spillover effects would underestimate the influence of NSC on urban knowledge innovation, which is unfavorable for policy evaluation.

Table 4 Benchmark regression results.

Mechanism test

Test of basic effect

To explore the mechanisms through which NSC affects knowledge innovation, we conduct empirical tests at three levels: basic effect, network effect, and technology effect, based on the theoretical analysis in the section “Research hypothesis”.

Table 5 focuses on the basic effect of NSC. As the first step, column (4) in Table 4 indicates that NSC construction significantly improves urban knowledge innovation performance. The second-step regression results, shown in columns (5), (7), and (9) of Table 5, indicate that the policy treatment effects are all significantly positive at the 1% level (6.216/3.228/0.057). This result suggests that NSC construction not only promotes regional S&T investment and an increase in R&D personnel but also helps improve digital infrastructure. Furthermore, as with knowledge innovation performance, the construction of NSC can also have a positive effect on the financial, human, and material resources of innovation in geographically adjacent regions. This phenomenon can be interpreted through the existing literature. For instance, the Chinese government’s integration of S&T investment targets into the evaluation criteria for local officials has stimulated innovation competition, compelling neighboring city governments to increase their S&T investment (Liu et al., 2020; Gao and Yuan, 2020). In addition, regional integration development strategies have reduced inter-regional transit time, facilitating the flow of R&D elements (Tang and Cui, 2023; Yang et al., 2021). The increase in S&T investment, as well as the aggregation of talent, will also propel industrial structure upgrading (Gao and Yuan, 2020), expedite the construction of digital infrastructure, and enable urban digital transformation.

Table 5 Basic effect test results.

Finally, columns (6), (8), and (10) in Table 5 show the third-step regression results, in which S&T investment, R&D personnel, and digital infrastructure are each included in the regression equation together with the treatment effect variable. The estimated coefficients of the mechanism variables are all significantly positive at the 1% level (1.023/1.369/90.641), and the promotion effects of NSC on knowledge innovation remain significant, but the absolute values of the coefficients have decreased. Theoretically, the innovation effects of investments in S&T, R&D personnel, and digital infrastructure have substantial empirical support. Firstly, concerning the impact of investments in S&T on innovation, a substantial body of research has demonstrated the stimulating impact of government subsidies on the R&D activities of enterprises. Studies have also focused on the innovation effects of public sector S&T investments across different dimensions, such as research institutes and cities, finding that fiscal investment in S&T has led to an increase in both scientific publications and patents (Link and Scott, 2021; Chen et al., 2023b). Secondly, since the proposition of endogenous growth theory, the accumulation of human capital has been considered the fountainhead of economic growth, significantly determining a nation’s innovative capacity. Liu and White (1997) emphasized that innovation is driven by both absorptive capacity and new knowledge sources, with R&D personnel serving as a crucial manifestation of the former. Studies by Suseno et al. (2020), Lao et al. (2021), and Wen et al. (2023) have elucidated the innovation effects of high-level human capital from different perspectives. Thirdly, the innovation effects of digital (information) infrastructure are primarily realized through two mechanisms (Liu and Li, 2021; Zhang et al., 2022; Guo and Zhong, 2022; Ma and Lin, 2023; Tang and Zhao, 2023): (1) by reducing information asymmetry; (2) by breaking through administrative boundaries and geographical distances, facilitating information exchange and knowledge spillover among innovative entities.

Given the aforementioned statistical results and theoretical foundation, we are justified in deducing that NSC can promote the development of knowledge innovation through the impact on regional financial resources, human resources, and material resources, and the basic effect in hypothesis 1 has been tested.

Test of network effect and technology effect

Following the same methodology as the basic effect test, Table 6 reports the regression results of the network effect and technology effect, using column (4) in Table 4 as the benchmark test (first step).

Table 6 Network effect and technology effect test results.

On the one hand, taking Centrality_Degree and Centrality_Closeness as dependent variables, columns (11) and (13) in Table 6 show that the regression coefficients of NSC are both significantly positive at the 1% level (9.069/8.680), indicating that the construction of NSC has improved the centrality of the city in the regional research cooperation network, that is, it promotes the embedding of the city in the network structure. By relying on a supercomputer comprising thousands of processors and extending the development of new digital technologies such as big data, artificial intelligence, and cloud computing, NSC construction has not only expanded the city’s computing power services but also enhanced the region’s information and knowledge processing capabilities. Other cities are also more willing to establish cooperative relationships with NSC cities, thus promoting the embedding of the regional innovation network. Columns (12) and (14) in Table 6 report the regression results for the third step of the network effect, where Centrality_Degree and Centrality_Closeness are each included in the regression equation together with NSC. The estimated coefficients of the mechanism variables are both significantly positive at the 1% level (0.736/0.528). The promotion effect of NSC on knowledge innovation remains significant, and the estimated coefficients decrease from 8.934 to 2.226 and 4.329. According to social network theory, disparities in the positions of individuals within a network can significantly influence the quantity and quality of information and resources they acquire, which leads to variations in innovative performance. Existing literature, based on diverse samples, has documented the higher innovative performance associated with higher centrality in networks (Han et al., 2021; Wang et al., 2019). When entities occupy more central positions, they can engage in multidimensional technical collaborations and knowledge exchanges with various members. This enhances their capacity to absorb, transform, and reconfigure knowledge.

This result confirms that the construction of NSC can enhance the level of knowledge innovation by promoting the city’s embedding in the scientific cooperation network, thereby verifying hypothesis 2.

On the other hand, column (15) in Table 6 presents the second-step estimation result of the technology effect mechanism. It shows that the estimated coefficient of NSC is significantly positive at the 1% level (0.091). After including Innova_effi and NSC in the regression equation, the estimated coefficient of the mechanism variable remains significantly positive at the 1% level (55.269). The absolute value of NSC’s estimated coefficient decreases from 8.934 to 3.884 while remaining significant. The significance of knowledge production efficiency for innovation is primarily manifested in two dimensions. Firstly, R&D activities entail considerable risks and uncertainties, and high efficiency helps reduce the costs associated with the knowledge innovation process. Secondly, R&D resources are limited. High efficiency implies a more effective use of funds and human resources, enabling a greater quantity and higher quality of innovative outcomes with the same inputs. Existing studies not only reveal the positive impact of technological efficiency enhancement on innovation performance but also underscore the crucial roles played by management efficiency and resource allocation efficiency in the innovation process (Bughin and Jacques, 1994; Hu and Chen, 2016; Yang et al., 2022b).

The theoretical analysis and statistical results above indicate that NSC promotes the development of knowledge innovation by improving regional innovation efficiency, which confirms the technology effect (hypothesis 3). By enhancing the city’s computing power and allocation efficiency, NSC not only shortens the cycle of knowledge innovation and reduces its costs, but also optimizes the allocation of S&T resources, achieving scientific decision-making for urban innovation development.

Further analysis

Policy spillover

Geographical proximity is not the only pathway that affects the diffusion of knowledge. With the rapid development of digital technology and the increasing improvement of infrastructure, the cross-regional flow of R&D elements makes the spatial connection of different cities closer. This study further examines the policy spillover of NSC from two aspects: cooperation proximity and digitization proximity. To address the problem of unclear coefficient economic implications, LeSage and Pace (2009) proposed the use of the partial derivative matrix method to divide the impact of the independent variable on the dependent variable into direct effect and indirect effect. In this study, the impact of the local NSC on knowledge innovation is considered a direct effect, while the impact of other regional NSCs on local knowledge innovation is regarded as an indirect effect.
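The partial derivative decomposition can be sketched as follows for an SDM with one regressor of interest. The matrix of partial derivatives is \((I-\rho W)^{-1}({\beta }_{1}I+{\beta }_{2}W)\); the direct effect is the average of its diagonal, and the indirect effect is the average row sum minus the direct effect. The code below is a minimal illustration that expands the inverse as a power series (valid for a row-normalised W and |ρ| < 1); the parameter values are arbitrary and are not the paper's estimates.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def sdm_effects(rho, beta_own, beta_lag, w, tol=1e-12, max_iter=10_000):
    """Return (direct, indirect, total) average effects for one SDM regressor.

    Uses (I - rho*W)^(-1) = I + rho*W + rho^2*W^2 + ..., which converges
    for a row-normalised W when |rho| < 1.
    """
    n = len(w)
    # First term of the series: beta_own * I + beta_lag * W.
    term = [[beta_own * (i == j) + beta_lag * w[i][j] for j in range(n)] for i in range(n)]
    effects = [row[:] for row in term]
    for _ in range(max_iter):
        term = [[rho * v for v in row] for row in matmul(w, term)]
        effects = [[e + t for e, t in zip(er, tr)] for er, tr in zip(effects, term)]
        if max(abs(v) for row in term for v in row) < tol:
            break
    direct = sum(effects[i][i] for i in range(n)) / n
    total = sum(sum(row) for row in effects) / n
    return direct, total - direct, total


# Illustrative row-normalised weights for three equally connected cities.
w = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
direct, indirect, total = sdm_effects(rho=0.3, beta_own=2.0, beta_lag=1.0, w=w)
```

With a row-normalised W the total effect equals \(({\beta }_{1}+{\beta }_{2})/(1-\rho )\), which gives a quick sanity check on any implementation; feedback through neighbours also makes the direct effect differ from the raw coefficient \({\beta }_{1}\).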

The estimated results are shown in Table 7, revealing that regardless of whether the geographical proximity matrix, cooperation proximity matrix, or digitization matrix is utilized, the estimated coefficients of NSC in both direct and indirect effects are significantly positive, confirming the existence of policy spillovers of NSC under different matrices (H4a is verified). This indicates that, in addition to radiating to geographically adjacent areas, NSC provides computing services to closely connected partners, thereby promoting local knowledge innovation development. Furthermore, the policy spillover effect of the supercomputing center is more effective when cities have comparable levels of digitization. The above results verify the policy spillover at the cooperative dimension, indirectly indicating that if the digital technology level of the city is limited, local knowledge innovation development is difficult to benefit from the NSC construction in the advanced areas. Considering the size and significance of the indirect effect coefficient, it can be seen that under the geographic and digitization matrices, the indirect effect of NSC is stronger, reflecting that both geographic distance and economic (digitization) distance are still the primary factors influencing policy spillover.

Table 7 Policy spillover effect test results.

Given the existence of policy spillover from NSC, we construct a β-convergence model following Sala-i-Martin (1996) to examine the impact of NSC on regional knowledge innovation convergence. In Eq. (5) below, \(L.{\mathrm{ln}{KI}}_{{it}}\) represents the lagged term of knowledge innovation, and \(D.{\mathrm{ln}{KI}}_{{it}}\) is its first-order difference. Our focus is the change in \({\beta }_{0}\) before and after the inclusion of NSC. If \({\beta }_{0}\) is significantly negative and its absolute value increases notably once NSC is included, the establishment of NSC contributes to inter-regional convergence in knowledge innovation.

$$\begin{array}{ll}D.{\mathrm{ln}{KI}}_{{it}}={v}_{i}+{\mu }_{t}+\rho {WD}.{\mathrm{ln}{KI}}_{{it}}+{\beta }_{0}L.{\mathrm{ln}{KI}}_{{it}}\\\qquad\qquad\quad+\,{\beta }_{1}{{NSC}}_{{it}}+\gamma {Z}_{{it}}^{{\prime} }+{\varepsilon }_{{it}}\end{array}$$
(5)
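The core idea of Eq. (5), that a negative coefficient on the lagged level signals convergence, can be illustrated with a stripped-down sketch. It runs a pooled bivariate regression of the log difference on the lagged log level and leaves out the fixed effects, spatial lag, NSC term, and controls of the actual model; the two city series are synthetic.

```python
import math


def beta_convergence_slope(ki_by_city):
    """Slope from regressing D.lnKI on L.lnKI, pooled over cities and years.

    ki_by_city: {city: [KI in consecutive years]}, all values strictly positive.
    """
    lagged, growth = [], []
    for series in ki_by_city.values():
        logs = [math.log(v) for v in series]
        for prev, curr in zip(logs, logs[1:]):
            lagged.append(prev)         # L.lnKI
            growth.append(curr - prev)  # D.lnKI
    n = len(lagged)
    mean_x, mean_y = sum(lagged) / n, sum(growth) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, growth))
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    return sxy / sxx


# Synthetic example: the low-KI city grows faster than the high-KI city.
catching_up = {"leader": [100.0, 110.0, 121.0], "laggard": [10.0, 15.0, 22.5]}
# Synthetic counter-example: the high-KI city pulls further ahead.
diverging = {"leader": [100.0, 150.0, 225.0], "laggard": [10.0, 11.0, 12.1]}

slope_converging = beta_convergence_slope(catching_up)
slope_diverging = beta_convergence_slope(diverging)
```

A negative slope means cities starting from lower KI levels grow faster, which is the convergence pattern; the comparison of interest in the text is how much this slope changes once NSC enters the regression.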

The regression results are reported in Table 8. Whether using spatial econometric models or ordinary least squares, the coefficient of the lagged term L.lnKI for knowledge innovation is significantly negative at the 1% level, implying that, after accounting for factors such as per capita GDP, comprehensive growth rate, and industrial structure, latecomer regions have a higher knowledge growth rate than knowledge-intensive regions. Columns (24) and (26) present the regression results after incorporating NSC, where the coefficient of L.lnKI keeps the same sign and significance, with only a slight increase in absolute value from 0.632 and 0.630 to 0.634 and 0.652. This indicates that although NSC can achieve policy spillover through geographical proximity, cooperation proximity, and digitization proximity, the impact is not sufficient to drive regional knowledge innovation convergence. Hypothesis 4b is therefore not supported by the results.

Table 8 Convergence effect test results.

Heterogeneity analysis

The antecedent findings corroborate the knowledge innovation effects of NSC and unveil its primary mechanisms. Nevertheless, this impact may vary due to differences in urban knowledge orientation and scientific environments. This paper examines this heterogeneity in three distinct ways.

In comparison to research in aerospace, meteorology, engineering simulation, and other fields, basic scientific research may be less affected by NSC, despite collaborative research in areas like new energy, new materials, particle-liquid simulation, and condensed matter physics within Chinese NSCs. To build a single NSC in China, the government typically invests at least tens of millions of dollars, expecting that NSC advancements will tackle tangible societal challenges and propel economic innovation. Yet, according to information from prominent Chinese supercomputer portals and media coverage, the use of supercomputing in Mathematics seems relatively rare. Thus, based on “Research Area” information from the WoS database, scientific publications affiliated with each city under “Mathematics” are obtained. Firstly, cities are classified into high-percentage groups (City_B) and low-percentage groups (City_NB) based on the proportion of such publications (see “Critical value” in Table 9, where 0.03 signifies that if the city’s mathematics publications exceed 0.03 in proportion, it is categorized as “City_B” with a value of 0). Although Bdiff command tests indicate no significant differences in NSC coefficients between the two groups, the absolute value of the NSC coefficient in the “City_NB” group is slightly higher than that in the “City_B” group.

Table 9 Heterogeneity analysis based on grouping and moderating effect.

Secondly, this paper constructs the interaction term (NSC×Basic) to examine the moderating effect of urban knowledge orientation on the impact of NSC, as shown in columns (31) and (32) in Table 9. The estimated coefficients of the interaction term (NSC×Basic) are significantly negative at the 1% level (−2.956/−2.573), indicating that the knowledge innovation effects of NSC tend to be lower in cities with a high proportion of “Mathematics” publications.

Thirdly, heterogeneity tests based on grouping or moderating effects cannot identify the effect for each individual city and are less informative when there are few groups. This paper therefore employs the synthetic difference in differences proposed by Arkhangelsky et al. (2021) to estimate the individual treatment effects (ITE) of cities. The estimation results in Table 10 provide the average treatment effects, T-values, and 95% confidence intervals for each city in the treatment group. It is noteworthy that only Shenzhen and Guangzhou exhibit significant treatment effects (8.938/3.685), and the Basic values in these two major cities are lower than the mean of the treatment group. Among them, Shenzhen stands out as a typical application innovation-oriented city (while also facing criticism for lacking a layout in basic research).

Table 10 Heterogeneity analysis based on synthetic difference in differences.

Robustness test

Parallel trend test

The selection of an NSC site requires consideration of both the economic foundation of the city itself and its radiating influence in the region. Typically, the chosen city already possesses a relatively advanced knowledge base. To ensure the SDID model satisfies the “parallel trend” assumption prior to shock, we further examine the trend changes in both NSC and non-NSC cities. The equation is set as follows:

$$KI_{it} = v_{i} + \mu_{t} + \rho W KI_{it} + \beta_{1} \sum_{k \ge -9}^{+11} NSC_{2009+k}^{\prime} + \gamma Z_{it}^{\prime} + \beta_{3} W Z_{it}^{\prime} + \varepsilon_{it}$$
(6)

The study focuses on the coefficient β1 of the interaction term between the time dummy variable and the NSC city dummy variable (if the city has NSC, the value is 1; otherwise, it is 0), as shown in Fig. 3. The observation of the treatment effect can be divided into two stages. The first stage is before 2009, where it can be observed that the estimated coefficients of the interaction term are not significant, indicating no statistically significant differences in knowledge innovation changes between the treatment and control groups before policy implementation. The second stage is from 2009 to 2020, during which the policy treatment effect began to emerge in the second year of NSC construction and has been increasing year by year. The model satisfies the pre-assumption of “parallel trends,” while also presenting the dynamic changes of the treatment effect.
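
As a rough illustration of how the interaction terms in Eq. (6) might be built, the sketch below generates one dummy per relative period k for a given city-year. The function name `event_dummies` and the constants are assumptions made for illustration. In practice one period is usually omitted as the reference category; the text does not state which, so no period is dropped here.

```python
POLICY_YEAR = 2009      # first NSC construction year used in Eq. (6)
K_MIN, K_MAX = -9, 11   # range of k in Eq. (6), i.e. calendar years 2000 to 2020


def event_dummies(year: int, is_nsc_city: bool) -> dict[int, int]:
    """Return {k: 0 or 1}; the dummy for k equals 1 only for an NSC city
    observed in calendar year POLICY_YEAR + k."""
    return {
        k: int(is_nsc_city and year == POLICY_YEAR + k)
        for k in range(K_MIN, K_MAX + 1)
    }


# An NSC city observed in 2011 switches on the k = +2 dummy only.
row = event_dummies(2011, True)
```

Each dummy receives its own coefficient in the regression, so that pre-2009 coefficients test the parallel trend assumption and post-2009 coefficients trace the dynamic treatment effect.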

Fig. 3: Parallel trend test.

Following the construction of NSC, the estimated coefficients gradually become significant, and the effect of the policy begins to manifest. This also indicates that the DID model satisfies the assumption of pre-treatment parallel trends.

Endogenous processing

Due to the potential inclination of NSC construction sites toward cities with superior digital infrastructure, these urban centers often exhibit a heightened level of knowledge production. To mitigate the bias stemming from sample selection, reverse causality, and omitted variables, we try to address this issue under both OLS and spatial econometric models:

On the one hand, treating the NSC as an endogenous variable, this study selects per capita fixed-telephone ownership in 1984 (FT) (Li and Wang, 2022), the relief degree of land surface (Rdls) (Zhang et al., 2022), and the frequency of digital economic policy terms (FDEPT) (Jin et al., 2022; Tao and Ding, 2022) as instrumental variables. These are chosen based on their correlation with the endogenous variable and their independence from the error term. In terms of relevance, the historical level of information infrastructure influences the subsequent development of digital technology in the region; computing efficiency depends on data transfer speed (connectivity), which is influenced by the Rdls of the city (the cost and difficulty of constructing digital infrastructure); and whether a region is selected as an NSC construction city is also influenced by the degree of emphasis on digitization in public sector policies. In terms of exogeneity, the historical variable represented by FT and the geographical variable represented by Rdls have exclusive characteristics. Given that FT and Rdls are both cross-sectional data, this study adopts the approach outlined by Nunn and Qian (2014): FT is multiplied by the previous year’s nationwide total of internet and mobile phone users, while Rdls is multiplied by the time trend terms.
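
The construction of time-varying instruments from cross-sectional variables can be sketched as follows. This is only an illustration under stated assumptions: the function names `iv_ft` and `iv_rdls`, the linear form of the time trend, the base year, and all numeric values are hypothetical and are not taken from the paper.

```python
def iv_ft(ft_1984: float, national_users: dict[int, float], year: int) -> float:
    """FT (cross-sectional) times the previous year's nationwide user total."""
    return ft_1984 * national_users[year - 1]


def iv_rdls(rdls: float, year: int, base_year: int = 2000) -> float:
    """Rdls (cross-sectional) times a linear time trend.

    The trend equals 1 in base_year; both the linear form and the
    base year are assumptions made for this sketch.
    """
    return rdls * (year - base_year + 1)


# Made-up inputs for one city
national_users = {2008: 100.0, 2009: 120.0}
ft_2009 = iv_ft(0.5, national_users, 2009)   # uses the 2008 total
rdls_2002 = iv_rdls(2.0, 2002)               # trend value is 3 in 2002
```

Interacting a fixed city characteristic with a series that varies only over time yields an instrument that varies across both cities and years, which is what a panel regression with city fixed effects requires.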

The validity tests for the instrumental variable selection are presented in Panel A of Table 11, where the Kleibergen-Paap rk LM statistics are significant at the 1% level, the F-values are all greater than 10, and both the Cragg-Donald Wald and Kleibergen-Paap rk Wald statistics exceed the critical values of the Stock-Yogo weak ID test (10% maximal IV size). This suggests that the three instrumental variables do not suffer from “under-identification” or “weak instrument” problems. Columns (33), (34), and (35) show the regression results for the first stage, indicating that the estimated coefficients of FT and FDEPT are significantly positive at the 1% level (0.249/5.016). This implies that the historical level of information infrastructure and the policy attention of the public sector to digitization indeed have a positive impact on whether a city is selected as an NSC city. In contrast, the estimated coefficient for Rdls is significantly negative at the 1% level (−0.015), reflecting that a higher Rdls does hinder a city from being selected as an NSC city. The results of the second-stage regression are shown in columns (36), (37), and (38), with the NSC estimated coefficients all being significantly positive at the 1% level (69.125/37.768/20.310).
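
To make the logic of instrumental variable estimation concrete, the following toy sketch compares the OLS slope with the just-identified IV slope for a single regressor and a single instrument. It is not the paper's specification, which includes controls and fixed effects; the data are synthetic and constructed so that the confounder `u` is exactly uncorrelated with the instrument `z`.

```python
def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    """Population covariance of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


def ols_slope(x, y):
    """OLS slope of y on x: cov(x, y) / var(x)."""
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Just-identified IV slope: cov(z, y) / cov(z, x)."""
    return cov(z, y) / cov(z, x)


# Synthetic data: the true effect of x on y is 2, and u confounds both.
z = [-1.0, 1.0, -1.0, 1.0]                       # instrument
u = [1.0, 1.0, -1.0, -1.0]                       # unobserved confounder
x = [zi + ui for zi, ui in zip(z, u)]            # endogenous regressor
y = [2.0 * xi + 3.0 * ui for xi, ui in zip(x, u)]

biased = ols_slope(x, y)       # absorbs the confounder
consistent = iv_slope(z, x, y)  # uses only variation driven by z
```

In this constructed example the OLS slope is 3.5, while the IV slope recovers the true value of 2 because the instrument is unrelated to the confounder.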

Table 11 Endogeneity treatment (IV).

On the other hand, we try to address the endogeneity issue in the spatial econometric model in three different ways. (1) Dynamic SDM. Compared to static models, the dynamic SDM is advantageous in its more comprehensive consideration of time factors. This study sequentially includes the time-lagged dependent variable (dlag_1), the space-time-lagged dependent variable (dlag_2), and both of them (dlag_3) as explanatory variables in the regression model. The results shown in columns (39), (40), and (41) of Table 12 indicate that the estimated coefficients of the NSC are significantly positive at the 1% level (98.422/8.886/99.982). (2) Generalized spatial two-stage least squares (GS2SLS). Following the approach of Wang et al. (2022), we select the independent variable and its spatial lag term as instrumental variables. The regression results are shown in column (42) of Table 12. Whether first-order (1st order), second-order (2nd order), or third-order (3rd order) lagged independent variables are used (only the results for the 1st order are presented), the estimated coefficients of the NSC are also significantly positive at the 1% level. (3) GS2SLS with external instruments. When instrumental variables such as FT, Rdls, and FDEPT are incorporated into the GS2SLS model, the NSC estimated coefficients remain significantly positive at the 1% level (68.396/31.614/23.229).
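
The spatial lag terms used as instruments can be illustrated with a small sketch. It assumes a row-normalized weight matrix and assumes that the higher lag orders correspond to repeated application of W (that is, WX, W²X, and so on); the text does not spell out either point. The three-city weight matrix and the values of `x` are made up.

```python
def row_normalize(w):
    """Scale each row of a spatial weight matrix so that it sums to 1."""
    normalized = []
    for row in w:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def spatial_lag(w, x):
    """Compute Wx: for each city, the weighted average of x over its neighbors."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


# Made-up example: city 0 neighbors cities 1 and 2; cities 1 and 2 neighbor only city 0.
w = row_normalize([[0, 1, 1],
                   [1, 0, 0],
                   [1, 0, 0]])
x = [1.0, 2.0, 4.0]

wx = spatial_lag(w, x)     # first-order spatial lag
w2x = spatial_lag(w, wx)   # second-order spatial lag, W applied twice
```

Here the first-order lag of city 0 is the average of its two neighbors' values, (2 + 4) / 2 = 3.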

Table 12 Endogeneity treatment (GS2SLS).

The above outcomes indicate that potential endogeneity concerns do not significantly affect the validity of the baseline results.

Placebo and PSM-SDID

Building upon the parallel trend tests, this study employs counterfactual analysis to perform a placebo test. By changing the construction time of the NSC and re-estimating the treatment effect, we examine whether the improvement in urban knowledge innovation is caused by the NSC. If the coefficient under the fictitious policy time is significant, it suggests that the improvement of the urban knowledge innovation level may not be caused by NSC, and the conclusion is not robust. Following Gao and Yuan (2020), we only retain the samples from the period between 2000 and 2008 and re-estimate the model after moving the policy time forward by one period (2008), two periods (2007), and three periods (2006). The results are shown in columns (46), (47), and (48) of Table 13; the NSC coefficient is insignificant in each case, which indirectly supports attributing the improvement in the knowledge innovation level to the NSC. Moreover, this study employs the PSM-SDID method to conduct robustness checks on the original model, to overcome potential endogeneity issues caused by selection bias, and to enhance the accuracy of the causal identification results. Kernel matching is performed year by year, with Econ, Industry_sec, n + g + δ, and the proportion of fiscal S&T expenditure to GDP selected as covariates. The standardized bias of each covariate after matching is less than 20 percent. Considering the requirements of spatial econometric models for balanced panel data, the samples with missing data in a given year are removed, and ultimately, 714 samples are retained. As shown in columns (49), (50), and (51) of Table 13, whether the spatial econometric model is adopted or not, the estimation results are consistent with the benchmark regression results, indicating the robustness of the positive impact of NSC on urban knowledge innovation obtained in the previous analysis.
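
The balance criterion mentioned above can be sketched with the commonly used definition of standardized bias: the difference in covariate means between treated and control units, as a percentage of the square root of the average of the two sample variances. Whether the paper uses exactly this definition is an assumption; the function name `standardized_bias` and the sample values are hypothetical.

```python
import math
from statistics import mean, variance


def standardized_bias(treated, control):
    """Standardized bias in percent for one covariate.

    100 * (mean_T - mean_C) / sqrt((var_T + var_C) / 2),
    using sample variances.
    """
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    return 100.0 * (mean(treated) - mean(control)) / pooled_sd


# Made-up covariate values for treated and control cities
poorly_matched = standardized_bias([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
well_matched = standardized_bias([1.0, 2.0, 3.0], [1.1, 2.0, 3.0])

# The rule of thumb cited in the text: absolute bias below 20 percent.
is_balanced = abs(well_matched) < 20
```

In the first made-up comparison the groups differ by a full standard deviation, so the bias is −100 percent; in the second the groups are nearly identical and the bias stays well inside the 20 percent bound.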

Table 13 Placebo and PSM.

Replacing the estimation method, variable, and sample

(1) Change the estimation method. Given that some cities have a value of zero for knowledge innovation, which accounts for a certain proportion of observations, the dependent variable being clustered on the left side of the value range may lead to biased estimation. Therefore, Tobit and negative binomial models are used to re-estimate the results. As shown in columns (52) and (53) of Table 14, the estimated coefficients of NSC are significantly positive at least at the 1% level (8.958/0.236), indicating that the benchmark test results are not significantly affected by the structural characteristics of the data. (2) Replace the dependent variable. Firstly, the knowledge innovation variable in our study is constructed by taking the ratio of urban S&T publications to the number of permanent urban residents. We replace permanent residents with urban employees to construct a new knowledge innovation variable and conduct another estimation. The estimated coefficient of the NSC is also significantly positive at the 1% level (25.857). Secondly, when the number of highly cited papers in a city is used as the dependent variable, the estimated coefficient of NSC is significantly positive at the 1% level (0.491). The treatment effect of the policy remains robust, and both the quantity and quality of knowledge innovation are measured, achieving cross-validation. (3) Change the sample. Considering that the small number of treated cities in the sample may bias the estimation, this paper changes the sample in two ways. On the one hand, the control group is reduced: the study only retains 35 large and medium-sized cities in China and deletes the samples of other cities for re-estimation. On the other hand, the sample dimension is changed: we raise the dimension to the inter-provincial level (31 provinces, municipalities, and autonomous regions), and Tianjin, Guangdong, Shandong, Jiangsu, Hunan, and Henan are used as the treatment group (the data on S&T publications at the provincial level come from the “China Science and Technology Statistical Yearbook”). Table 14 shows the SDID regression results (columns (56) and (57)), and the estimated coefficients of NSC are all significantly positive (3.020/3.338), indicating that the previous research results are very robust.

Table 14 Replacing the estimation method, variable, and sample.

Conclusion and policy implications

Discussion and conclusion

As an important component of the national innovation system, LSRIs possess the capability to explore the unknown world, discover natural laws, and achieve S&T outputs. Existing research has revealed the impact of LSRIs on socio-economic development (especially in S&T innovation) from different perspectives (Marcelli, 2014; Michalowski, 2014; Qiao et al., 2016; Beck and Charitos, 2021), and theoretically explored the various dimensions of LSRIs’ scientific effect (Michalowski, 2014; Qiao et al., 2016). However, there are two primary challenges in evaluating the impacts of construction: firstly, insufficient examination of the link between LSRIs and regional knowledge production; secondly, limited testing conducted within a causal inference framework. This study examines the impact of LSRIs on regional knowledge innovation against the backdrop of the Chinese NSC. The research results reveal the positive significance of this effect to a certain extent, directly confirming the scientific effect or S&T advancement effect of LSRIs mentioned in existing literature (Michalowski, 2014; Qiao et al., 2016). Through mechanism testing, the identification of the network effect (Lozano et al., 2014; Qiao et al., 2016; D’ippolito and Rüling, 2019; Beck and Charitos, 2021), capability cultivation (Michalowski, 2014; Qiao et al., 2016), and the clustering effect (Qiao et al., 2016; Beck and Charitos, 2021) is indirectly achieved. While increasing regional scientific financial, human, and material resources, LSRIs also contribute to the embedding of cities in regional innovation networks and the efficiency of utilizing innovation resources. Qiao et al. (2016) considered that the network effect is an important mechanism for LSRIs to interact with science stakeholders and strengthen scientific cooperation. Based on the co-publication of scientific publications, this study extends such network effects to the scientific cooperation connections established between cities. 
Other cities are more willing to establish cooperative relationships with NSC cities because they can benefit from computing power services. The improvement of data processing capabilities will also alleviate information overload problems and stimulate NSC cities to actively integrate into the innovation network. Some studies have mentioned the function of LSRIs in technology promotion and knowledge diffusion (Beck and Charitos, 2021; Scarrà and Piccaluga, 2022). Based on the spatial econometric model, our research results reveal the spillover effect of LSRI implementation, and this diffusion mechanism exists on multiple levels, including geographical proximity, cooperation proximity, and digitization proximity.

As a new productive force in the digital economy era, computing power plays an important role in promoting S&T progress, industry digital transformation, and economic and social development. NSC has the dual attributes of LSRI and digital infrastructure. Therefore, the research findings of this study also complement the literature regarding how digital infrastructure impacts the growth of innovation. Previous research has examined the impact of digital infrastructure on productivity and innovation from different dimensions including region and enterprise (Cardona et al., 2013; Balcerzak and Bernard, 2017; Zhou et al., 2021; Zhang et al., 2022; Tang and Zhao, 2023). However, the definition of digital infrastructure is mainly focused on network and communication infrastructure, with little attention paid to computing power. This paper provides direct evidence of how computing infrastructure impacts regional knowledge. NSC supplies high-performance computing services for scientific research, improving research and development efficiency and shortening the output cycle of scientific research results (Marcelli, 2014). Moreover, it drives the development of new digital technologies represented by 5G, big data, cloud computing, and artificial intelligence, promoting the development of regional knowledge innovation. In addition, NSC is a national-level computing power hub established by the government based on urban innovation ecosystems in specific geographic locations, and it undertakes the multiple missions of promoting local digital innovation and accelerating knowledge spillover. Therefore, NSC can also be regarded as a place-based innovation policy. The research findings of this study reveal the significant impact of the intervention on local knowledge production. 
However, the driving effect of local knowledge growth on the convergence of regional innovation is limited, which is different from the evaluation results of other place-based innovation policies like “National Innovative City” and “urban cluster” (Tang and Cui, 2023; Gao and Yuan, 2020). The main reasons could be that the number of cities with NSC is still limited, and the construction of provincial and even more microscopic-level supercomputing centers has not been considered, which may lead to an underestimation of the radiation effect from the center. It is noteworthy that, akin to certain assessments of policy or digital innovation effects (Zhou et al., 2021; Liu and Li, 2021; Zhang and Wang, 2022; Tang and Zhao, 2023), our study encapsulates the inter-regional heterogeneity of NSC knowledge innovation effects. Diverging from existing research that relies on economic or geographical heterogeneity analysis (Yang et al., 2021; Gao and Yuan, 2020; Chen et al., 2023b), our findings further unveil potential disparities in NSC innovation effects due to differences in urban scientific knowledge development emphasis.

In summary, the findings of this study can be distilled into the following key points:

  • NSC construction promotes local and surrounding area knowledge innovation.

  • The main mechanisms by which NSC promotes regional knowledge innovation include the increase in fiscal investment and talents in S&T (basic effect), the improvement of digital infrastructure (basic effect), as well as the enhancement of urban network centrality (network effect) and innovation efficiency (technology effect).

  • Geographical proximity, cooperation proximity, and digitization proximity constitute the main channels of policy spillover.

  • NSC has not shown a significant promoting effect on regional innovation convergence, and the radiation influence needs to be further improved.

  • Knowledge innovation effects of NSCs vary based on differences in urban knowledge orientation and scientific environments, with the treatment effects being notably pronounced in application innovation-oriented cities, exemplified by Shenzhen.

Policy implications

Firstly, considering the facilitating role of NSC in scientific knowledge production, it is necessary to enhance the supporting effect of LSRIs in basic scientific research and applied technological research. While improving R&D efficiency, policymakers should leverage the attraction of large-scale scientific projects to innovative factors, increase the investment in S&T and the number of R&D personnel, improve urban digital infrastructure, and promote the deep embedding of cities in scientific research collaboration networks.

Secondly, the study emphasizes the need to strengthen the policy spillover effect through various channels. This can be achieved by developing a city cluster strategy that coordinates the collaborative network of computing power within and around urban areas such as the Beijing-Tianjin-Hebei, Yangtze River Delta, and Greater Bay Area regions. Cities that have not yet established an NSC should optimize regional digital infrastructure and actively integrate into inter-regional cooperation networks, in order to create a favorable environment and basic conditions for cross-regional computing power scheduling, as well as to expand knowledge spillover through digitization and cooperation proximity.

Thirdly, given the weak effect of NSC on regional knowledge innovation convergence, the layout of computing infrastructure should be systematically optimized in the future, with a focus on guiding a reasonable hierarchy among general data centers, supercomputing centers, and intelligent computing centers, so as to strengthen the role of the national computing power hub as a connector and coordinator in the national computing power network. In this process, addressing the “computing power island” problem and expanding the influence of the centers are urgent issues that require inter-city computing power collaboration and on-demand scheduling solutions.

Fourthly, the findings of this study reveal that the implementation of a place-based innovation policy, through the strategic establishment of NSCs in different regions, can effectively facilitate the growth of local knowledge creation. However, in order to achieve inter-regional convergence of knowledge, a concerted effort to refine inter-regional coordination mechanisms needs to be undertaken, simultaneously with the expansion of the centers. The limited yet positive contributions of NSCs also furnish valuable insights for other countries in the construction of LSRIs, digital infrastructure development, and implementation of place-based innovation policies. Particularly, in the selection of NSC locations, there should be consideration for regional disciplinary emphasis and the innovation environment, coupled with increased support for fundamental research.