1 Introduction

Like many other kinds of economic activity, Research and Development (R&D) is spatially concentrated but also globally distributed (Dicken 2007; Fujita et al. 2001; Malecki 2014). As a result most of the world’s R&D expenditure is found in a relatively small number of city-regions around the world. Some of these city-regions are well known and have been extensively described in the scientific literature, such as California’s Silicon Valley (Castells 2014; Saxenian 1996). But the global position of many other city regions is less clear. For example, while Seoul is well known as a global R&D hub, is the R&D activity of other cities in South Korea also globally significant? (Yoon and Park 2017) And what about second-tier cities in Europe, the United States and Japan? Or emerging R&D centres in India and China? (Crescenzi and Rodríguez-Pose 2017).

A number of global city rankings have been produced in recent years, including the Global Cities Report by consulting firm A.T. Kearney, the JLL Typology of World Cities by the real estate firm Jones Lang LaSalle and The World According to GaWC by Loughborough University’s Globalization and World Cities Research Network (Beaverstock et al. 1999), but none of these focuses exclusively on R&D or innovation activity. On the other hand there are numerous international rankings of innovation, including the Global Innovation Index (Dutta and Lanvin 2016), the Global Competitiveness Report (Schwab and Sala-i-Martin 2015) and the Bloomberg Innovation Index by Bloomberg, a media company. These indexes all provide insight at the national level but ignore the sub-national city level. Therefore insight into the global innovation position of cities is lacking.

Because city-regions are the spatial scale at which innovation activity is concentrated, the purpose of this paper is to identify the world’s most important city-regions in terms of R&D expenditure. Critical in this process is the use of patent data, which serves (a) to identify high R&D city-regions, (b) as a proxy to estimate city-region R&D expenditure and (c) to allow the study of high R&D city-regions over longer timescales. These points are elaborated upon in the Methodology.

The ability to study city-regions over longer timescales allows global shifts in R&D expenditure to be observed (Dicken 2007) and addresses the important theoretical question of the extent to which R&D activity is path-dependent or whether significant path-changing forces also exist (Boschma and Frenken 2006; Crescenzi and Rodríguez-Pose 2011). Three five-year periods from 1997 to 2011 are considered in the study and they show significant changes in the global hierarchy of innovation city-regions, especially the rise of city-regions in Asia. A table of the 20 city-regions with the greatest R&D expenditure is provided in the Results. Before that the theoretical background of the geography of R&D activity and global shifts is briefly discussed in the Literature review.

2 Literature review

R&D activity is highly globalised and spatially concentrated at the same time (Feldman 1994; Malecki 2014; Storper 1997), two features which at first may appear contradictory because globalisation suggests footlooseness, while spatial concentration suggests that specific territorial advantages do exist (Bathelt et al. 2004; Binz and Truffer 2017; Gertler and Wolfe 2006). Spatial concentration is driven by economies of scale effects, the lowering of transaction and collaboration costs and a greater variety of local suppliers, customers and skilled labour, as well as less tangible social-environmental factors such as trust, local competition and creativity (Capello 2009; Feldman and Kogler 2010; Porter 1998). However spatial concentration can also lead to negative economies of scale effects such as congestion, higher cost and excessive competition for customers and resources (Martin and Sunley 2003). The effect of geographic concentration on firm growth therefore appears to be ambiguous (Frenken et al. 2015; Grillitsch and Nilsson 2017), although spatial concentration does appear to encourage firm entry (Frenken et al. 2015).

While there is ambiguity over the influence of spatial concentration on R&D activity, there appears to be strong path dependence, whereby spatial concentrations of R&D activity persist over long periods of time (Boschma and Frenken 2006; Crescenzi and Jaax 2017; Crescenzi and Rodríguez-Pose 2011). Path-creating factors that could lead to significant shifts in the spatial concentration of R&D activity include public policy intervention and foreign direct investment in R&D (Crescenzi and Jaax 2017; Dicken 2007). Investment in higher education and basic research, as well as targeted public–private sector R&D collaboration, play an important role in establishing and attracting high technology industries to a particular location (Lee and Lim 2001; Lee et al. 2009). In addition the need to distribute R&D activity globally is primarily driven by the expanding complexity of technology. Technological complexity requires increasingly specialised knowledge, which may only be available in a few locations, while competitive pressures encourage firms to seek the best researchers and collaboration partners globally (Audretsch et al. 2014; Locke and Wellhausen 2014). The globalisation of R&D has been facilitated by improvements in communications technology and a fall in air transportation costs, enabling frequent online and face-to-face contact (Gertler 2003; Maskell et al. 2006).

The speed at which the spatial concentration of R&D shifts and high R&D city-regions grow or decline provides an indication of the degree of path-dependence and path-creation in the current global R&D system. Significant spatial changes would suggest that the current global research and innovation environment is in flux, with significant path-creation occurring due to public intervention, international private sector investment or other factors.

3 Data and methodology

The methodology consists of two main components: the identification of innovation city regions and the estimation of R&D expenditure based on patent data. The two components are sequential in the sense that the identified innovation regions and their patent counts are used as proxies for the R&D expenditure estimation.

3.1 Identification of high R&D city regions

The identification of high R&D city regions is based on patent data obtained from the PatentsView database, which is published by the Office of Chief Economist in the United States Patent and Trademark Office (USPTO) and contains data on 6,647,699 patent grants from the USPTO (May 2018 edition; see Footnote 1). Because the United States is a large economy, many foreign entities also apply for patent protection at the USPTO, providing global coverage of patents, although with a home bias in favour of domestic United States patents. This home bias requires corrections to be made when comparing the US to other countries. The advantage of using a single source of patents (USPTO) is that all patents are granted in accordance with a single standard, improving the validity of making international comparisons (Toivanen and Suominen 2015).

Alternative databases to the USPTO database would be the European Patent Office (EPO) and the Japan Patent Office (JPO) as these also serve large advanced economies. However of the three databases, the USPTO database offers the greatest international coverage (Kim and Lee 2015), making it the most suitable for international comparative studies such as this one. In future China’s State Intellectual Property Office (SIPO) database may be an alternative to the USPTO, as China’s economic and technological importance continues to grow.

The USPTO database contains basic bibliographic information of patent documents such as patent identification numbers, application dates, inventors and assignees, the addresses of inventors and assignees, and patent citations, along with technological classifications. In this study the patent inventor addresses and application dates are used to identify the time and place where the R&D activity that resulted in the patent took place. The address allows for the identification of a geographic location of the inventive activity that led to the patent application.

The choice of inventors (individuals who carried out the R&D) rather than assignees (typically firms that financed the R&D) is not trivial. Inventors’ location provides information about where the R&D took place whereas the assignee location provides information about who owns the inventions. Given the globalization of R&D activity, assignees and inventors are frequently found in different countries.

Patent data allows for the identification of high R&D city-regions worldwide based on the actual local patenting intensity (Alcácer and Zhao 2016; Duranton and Overman 2005). This is an advantage compared to other statistical data, which is often based on administrative boundaries. Examples of sub-national administrative boundaries are states (Germany), regions (Italy), provinces (Spain), departments (France), counties (England) and prefectures (Japan). In some cases specific statistical boundaries exist, such as the metropolitan statistical area (United States). The first disadvantage of using administrative boundaries is their poor fit with the phenomenon being studied. Administrative boundaries may not fit well with actual innovation activity, which could spill across boundaries and/or take place in only a small part of the administrative area. An aggregate statistic for the administrative area could therefore give an inaccurate picture of the real spatial concentration of innovation activity. A second disadvantage of administrative boundaries is the difference in scales between countries. For example, China’s Guangdong province (179,800 km², 108.5 million people) cannot really be compared to the Netherlands’ Zuid-Holland province (2818 km², 3.6 million people), even though both are the most populous provinces in their respective countries. Such differences make global comparisons at the sub-national level very challenging.

High R&D city-regions can be defined as geographic concentrations of patenting activity. The identification of concentrations is based on the inventor address information obtained from the patent database. This address information is converted into coordinates through geocoding. For example, the address ‘Delft, The Netherlands’ is converted into the coordinates 51.9995142, 4.2938295.

Although the PatentsView database does provide coordinates for patent addresses, upon closer examination a number of these appear to be inaccurate because the coordinates are located in a different country than the address or the coordinates are only located at the country or state level, and not at that of a town or city, which is a problem in larger countries. For this reason approximately 6.5% of addresses are geocoded again, a process carried out in three stages.

In stage one all addresses in countries or territories which are less than 20,000 km² in size are automatically assigned coordinates. The largest among this group is New Caledonia (18,575 km²). Also included are countries and territories such as Kuwait, Montenegro, Qatar, Cyprus, Puerto Rico, Luxembourg, Hong Kong and Singapore.

In stage two the open-source TwoFishes geocoding application is used (using index files updated on 2015-03-05; see Footnote 2). TwoFishes is used and maintained by Foursquare Labs Inc., a company that operates a popular local search-and-discovery mobile application. An important advantage of TwoFishes is that it is open source and therefore its search results are reproducible. TwoFishes has been used in published and peer-reviewed scientific papers (Hamstead et al. 2018; Sessions et al. 2016) and it is listed in The SAGE Handbook of Social Media Research Methods (Sloan and Quan-Haase 2017).

In stage three the Google Maps geocoding application is used, which appears to have a higher success rate than TwoFishes, but it is a closed-source commercial service, making it potentially more difficult to reproduce results. Google Maps is also used in other scientometric studies (Leydesdorff and Persson 2010; Leydesdorff et al. 2014) and is listed in the aforementioned Handbook (Sloan and Quan-Haase 2017). The combination of geocoding techniques raises the share of addresses that can be accurately located from 93% to 96%.
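The three geocoding stages can be read as a fallback chain in which each address is passed to the next stage only if the previous one fails. The following is a minimal sketch of that control flow only. The lookup tables, function names and the coordinates for Singapore and Daejeon are illustrative placeholders, and the assumption that stage one assigns a single fixed coordinate per small territory is ours; no real TwoFishes or Google Maps calls are made.

```python
from typing import Callable, Optional, Sequence, Tuple

Coord = Tuple[float, float]
Geocoder = Callable[[str], Optional[Coord]]

# Hypothetical lookup tables standing in for the real services.
SMALL_TERRITORY_COORDS = {"Singapore": (1.3521, 103.8198)}         # stage one
OPEN_SOURCE_INDEX = {"Delft, The Netherlands": (51.9995142, 4.2938295)}  # stage two
COMMERCIAL_INDEX = {"Daejeon, South Korea": (36.3504, 127.3845)}   # stage three


def stage_one(address: str) -> Optional[Coord]:
    """Assign a fixed coordinate if the address lies in a small territory."""
    country = address.split(",")[-1].strip()
    return SMALL_TERRITORY_COORDS.get(country)


def stage_two(address: str) -> Optional[Coord]:
    """Stand-in for an open-source geocoder lookup."""
    return OPEN_SOURCE_INDEX.get(address)


def stage_three(address: str) -> Optional[Coord]:
    """Stand-in for a commercial geocoder lookup."""
    return COMMERCIAL_INDEX.get(address)


def geocode(address: str,
            stages: Sequence[Geocoder] = (stage_one, stage_two, stage_three)
            ) -> Optional[Coord]:
    """Return the first coordinate found, trying the stages in order."""
    for stage in stages:
        coord = stage(address)
        if coord is not None:
            return coord
    return None  # address remains unlocated
```

Ordering the stages from most to least reproducible means the closed-source service is consulted only for addresses that the other two stages cannot resolve.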

After geolocating inventors, each identified location receives a weighting (\(PTW_{i}\)) based on the number of inventors with an address in that location (\(INV_{ij}\)) divided by the number of inventors of the patent (\(INVT_{j}\)) and then summed for all patents (\(j\)). Thus:

$$PTW_{i} = \sum_{j} \frac{INV_{ij}}{INVT_{j}}$$

An example of the calculation: a patent with 3 inventors, 2 of whom have an address in ‘Delft, The Netherlands’, would add a weighting of \(2/3 = 0.67\) to the location of ‘Delft, The Netherlands’ (51.9995142, 4.2938295).
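The weighting can be illustrated with a short sketch. The function name and the two example patents are hypothetical; each patent is represented simply as a list holding one location string per inventor.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def location_weights(patents: Iterable[List[str]]) -> Dict[str, float]:
    """Compute PTW_i: the sum over patents j of INV_ij / INVT_j."""
    weights: Dict[str, float] = defaultdict(float)
    for inventor_locations in patents:
        total_inventors = len(inventor_locations)  # INVT_j
        if total_inventors == 0:
            continue  # skip patents without inventor addresses
        for location in inventor_locations:
            # Each inventor contributes 1 / INVT_j to his or her location,
            # so a location with INV_ij inventors receives INV_ij / INVT_j.
            weights[location] += 1.0 / total_inventors
    return dict(weights)


# Hypothetical example: the first patent matches the Delft case in the text.
example_patents = [
    ["Delft, The Netherlands", "Delft, The Netherlands", "Seoul, South Korea"],
    ["Seoul, South Korea"],
]
weights = location_weights(example_patents)
```

Because every patent distributes a total weight of one across its inventor locations, the weights sum to the number of patents, which avoids double counting of co-invented patents.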

Once the geographic concentration of patent output has been mapped, clusters are identified using the heat map approach. ‘Heat maps’ use the Kernel Density Estimation (KDE) method (Parzen 1962; Rosenblatt 1956), a spatial interpolation technique that is frequently used to spatially compile data about crime levels, traffic accidents, property values as well as temperature, from which the ‘heat map’ terminology originates. The KDE method appears to offer an important advantage over the cluster identification method used by Alcácer and Zhao (2016). Alcácer and Zhao (2016) appear to have gone through a labour-intensive process of assigning patent addresses to particular cities and then combining cities that are in close proximity (less than 40 mi or 64.4 km) into the same cluster. The KDE method skips the need to assign an address to a city as the weightings of nearby locations are combined; thus neighbourhoods, neighbouring cities, adjacent villages or a university campus are automatically interpolated into one big ‘hot spot’.

When applying the KDE method decisions must be made about two important variables: the interpolation range and the concentration threshold for recognizing an area as being of high concentration. The interpolation range can be decided based on several criteria. For example, Van Egeraat et al. (2018) use commuting distance while Alcácer and Zhao (2016) use 20 mi (32.2 km), with no justification given. Acs et al. (2002) note that within a 50 mi (80.5 km) distance from the boundaries of a metropolitan statistical area, there is still some positive innovation effect. The distance cited by Acs et al. (2002) is about four times the largest average daily commuting distance of a US city (Atlanta, GA, average commuting distance of 20.6 km) (Kneebone and Holmes 2015).
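The role of the interpolation range can be illustrated with a minimal KDE sketch. The kernel shape is not specified in the text, so a quartic (biweight) kernel is assumed here purely for illustration; the function names, the approximate coordinates for London and Cambridge and the two weights are likewise hypothetical.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def kernel_density(lat: float, lon: float,
                   points: List[Tuple[float, float, float]],
                   radius_km: float = 50.0) -> float:
    """Weighted density at (lat, lon); points are (lat, lon, weight) tuples."""
    density = 0.0
    for p_lat, p_lon, weight in points:
        d = haversine_km(lat, lon, p_lat, p_lon)
        if d < radius_km:
            # Quartic kernel (an assumption): influence decays to zero at the radius.
            density += weight * (1 - (d / radius_km) ** 2) ** 2
    return density


# Hypothetical weighted locations: approximate London and Cambridge coordinates.
points = [(51.5074, -0.1278, 100.0), (52.2053, 0.1218, 40.0)]
midpoint = ((51.5074 + 52.2053) / 2, (-0.1278 + 0.1218) / 2)

density_25 = kernel_density(*midpoint, points, radius_km=25.0)
density_50 = kernel_density(*midpoint, points, radius_km=50.0)
```

Under these assumptions the point midway between the two cities, roughly 40 km from each, receives no density at a 25 km range but a positive density at 50 km, which is how a larger range merges neighbouring poles into one region.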

To identify a suitable interpolation range multiple distances are explored (10, 25 and 50 km). To illustrate the differences between interpolation ranges consider the region around London, United Kingdom, which is a high R&D city-region with multiple poles, including London, Cambridge and Oxford, which are also home to well-known universities. Figure 1 shows the heatmaps for the 10, 25 and 50 km interpolation ranges. A 10 or 25 km range identifies many small but spatially proximate high R&D regions. A 50 km range (right) identifies three large urban regions, from top to bottom: Manchester, Birmingham and London (with London incorporating Cambridge, Oxford and Southampton). In this research a distance of 50 km is chosen, which identifies high R&D regions at a spatial level that appears to coincide with broader urban agglomerations. The distance between central London and Cambridge, Oxford and Southampton is between 80 and 100 km, which is similar to the knowledge spillover distance Acs et al. (2002) determined to be significant.

Fig. 1 Inventive activity heatmaps of London and environs with 10, 25 and 50 km radius

To classify a region as part of an innovation city, the intensity of inventive activity must exceed the 97.5th percentile, i.e. fall within the top 2.5%. Using a lower threshold risks creating very large innovation regions which in some cases cover several countries, which would remove the urban and sub-national spatial scale of the innovation city. Even at the 97.5th percentile some very large innovation city-regions are identified, including those surrounding Tokyo, Frankfurt and New York. However because of the empirical support that 50 km is a suitable distance (interpolation range) at which intensive collaborations do occur, the threshold is not raised further, as doing so would remove smaller cities from the sample.
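The thresholding step can be sketched as follows. The text does not state over which population of map cells the percentile is computed or which percentile definition is used, so the sketch assumes a dictionary of grid-cell densities and the nearest-rank method; all names and values are hypothetical.

```python
import math
from typing import Dict, Hashable, Iterable, Set


def percentile_cutoff(values: Iterable[float], pct: float = 97.5) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no density values supplied")
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def high_intensity_cells(grid: Dict[Hashable, float], pct: float = 97.5) -> Set[Hashable]:
    """Return the cells whose density lies strictly above the percentile cutoff."""
    cutoff = percentile_cutoff(grid.values(), pct)
    return {cell for cell, density in grid.items() if density > cutoff}


# Hypothetical grid of 200 cells with densities 0.0, 1.0, ..., 199.0.
example_grid = {cell: float(cell) for cell in range(200)}
hot_cells = high_intensity_cells(example_grid)
```

In this toy grid the five densest cells (2.5% of 200) are retained; contiguous retained cells would then be merged into a single city-region.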

The smallest innovation city-region that the chosen radius and threshold identify has an area of 112 km² (Sapporo, Japan) and the largest has an area of 100,260 km² (centered on Tokyo, Japan and encompassing Nagoya and Osaka). Although this large range in sizes challenges the traditional conception of what constitutes a city, there do not appear to be any strong theoretical reasons to exclude large city regions. Furthermore, when the city-region sizes are placed on a logarithmic scale, they are normally distributed.

Two more technical notes concerning the city identification process need to be made. First, in order to track the cities’ development over time, city-regions are identified based on the total patenting activity during the entire study period, from 1997 to 2011, and therefore a constant city-region boundary is used. The constant boundary facilitates comparisons over time but it could remove some of the possible spatial dynamics from the research.

Second, city-regions are named based on the largest city located within the cluster boundaries. City names, coordinates and populations are obtained from the Simple Maps World Cities dataset, which is based on public information from the National Geospatial-Intelligence Agency (NGA), United States Census Bureau, United States Geological Survey and the National Aeronautics and Space Administration (NASA). Its free basic dataset contains information about approximately 13,000 large cities worldwide (see Footnote 3).

Cities identified in an area with no significant population center are subject to additional scrutiny and often lead to the identification of miscoded locations (false positives), a problem that seems to occur in South Korea and Japan where 11 miscoded locations are identified, including Daejeon (South Korea), Yokkaichi, Kurashiki, Nara, Sendai, Kanagawa and Tochigi (Japan).

Patent activity heatmaps for Europe, the United States, Northeast Asia and Australia and New Zealand are included in the “Appendix”.

3.2 R&D expenditure estimation

R&D expenditure is chosen as the indicator to compare global city-regions because it comes closest to describing the real volume of innovation activity that is taking place. Other options, such as directly comparing patent output, suffer from a number of downsides such as different patenting frequencies between sectors and countries, which can lead to over- and under-estimations of R&D activity. R&D expenditure is also a strong predictor of future economic growth (Marković et al. 2017).

R&D expenditure and patent output are closely correlated (Hagedoorn and Cloodt 2003; Lanjouw and Schankerman 2004; Milovančević et al. 2017; Squicciarini et al. 2013) and therefore patents are a suitable proxy for estimating R&D expenditure. International R&D expenditure statistics are obtained from the UNESCO Institute for Statistics, which publishes gross domestic expenditure on R&D (GERD) data for a number of countries starting in 1996 (see Footnote 4).

Unfortunately a number of biases and variances exist that influence patent data. Therefore a patent is not equal to a fixed amount of R&D expenditure in every country. In fact, the range is quite large as can be seen in Table 1.

Table 1 R&D expenditure (2005 constant PPP million US$) per patent grant, three periods, selected countries

One reason for this variation is the so-called ‘home bias’. The home bias consists of two parts: the home country’s patents are overrepresented in its national patent database and home country patents are cited more frequently than foreign patents (Bacchiocchi and Montobbio 2010; de la Potterie and De Rassenfosse 2008). Other causes of variation in national patent output relative to R&D expenditure include differences in a country’s sectoral R&D profile, as patenting frequencies vary between sectors (Breschi et al. 2000; Kleinknecht et al. 2002; Toivanen and Suominen 2015) and differences in national institutions and laws (De Rassenfosse and de la Potterie 2009).

Therefore the R&D expenditure (\(EXP\)) in a particular city-region (\(C\)) is estimated using the number of patents generated within the city-region (\(PAT_{C}\)) multiplied by the R&D expenditure per patent (\(EPP\)) of the nation or territory (\(N\)) in which the city-region is primarily located, as shown in the equation below:

$$EXP_{C} = EPP_{N} \times PAT_{C}$$
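A small numerical sketch of this estimate follows. It assumes that \(EPP_{N}\) is obtained by dividing national GERD by the national number of patent grants for the same period, which is our reading of Table 1 rather than a stated formula; the function names and all figures are hypothetical.

```python
def expenditure_per_patent(national_gerd: float, national_patents: float) -> float:
    """EPP_N: national R&D expenditure per patent grant (assumed GERD / patents)."""
    if national_patents <= 0:
        raise ValueError("national patent count must be positive")
    return national_gerd / national_patents


def city_region_expenditure(epp_n: float, pat_c: float) -> float:
    """EXP_C = EPP_N * PAT_C."""
    return epp_n * pat_c


# Hypothetical figures in million US$ (constant PPP) and inventor-weighted patents.
epp_n = expenditure_per_patent(national_gerd=50_000.0, national_patents=2_500.0)
exp_c = city_region_expenditure(epp_n, pat_c=400.0)
```

With these placeholder values each patent corresponds to $20 million of national R&D expenditure, so a city-region with 400 weighted patents is assigned $8,000 million.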

4 Results

The identification process of high R&D city-regions during three periods (1997–2001, 2002–2006, 2007–2011) provides a global and historical overview of the real worldwide spatial distribution of R&D expenditure. A total of 132 city-regions are identified of which 76 are in North America, 36 are in Europe and 19 are in Asia-Pacific (20 if Israel is included). No high R&D city-regions are identified in South America, Africa or India. An overview of the top-20 city-regions is provided in Table 2.

Table 2 Worldwide high R&D city-regions, top-20 (R&D expenditure in 2005 constant PPP US$, world share)

At a global level the rising R&D expenditure of city-regions in Asia is clearly noticeable among the top-20 city-regions. Seoul, followed by Beijing, Shanghai, Taipei and Hong Kong all experience an increase in R&D expenditure and ranking position. During the 1997–2001 period 3 Asian city-regions are in the top 10. By 2007–2011 Shanghai and Beijing have also entered the top 10, making half of the top 10 R&D city-regions from Asia. City-regions in other parts of the world also experience higher R&D expenditure, but many have a constant or falling share of global R&D expenditure and rank due to the faster growth of R&D expenditure in Asian city-regions.

Top-20 city-regions which experience a fall in world share during the 1997–2011 period include Tokyo, Frankfurt, New York, Boston, Paris, Chicago, London, The Hague, Detroit, Minneapolis, Essen and Washington DC. City-regions outside Asia which increased their share are San Francisco, Los Angeles, Seattle, Moscow and Sydney.

The geography of R&D expenditure growth and relative decline appears to be defined by rising R&D expenditure in Asia and in city-regions along the Pacific Coast of the United States, Australia (Sydney) and Russia (Moscow). Declining top-20 city-regions, in relative terms, are found in Western Europe, Japan and the Northeastern United States.

Despite these changes the largest R&D city-regions are Tokyo and San Francisco during all three periods, with Tokyo experiencing a relative decline and San Francisco growing both in absolute and relative terms.

While the rise of Asian city-regions provides evidence of a global shift, it is partly due to the high concentration of R&D expenditure within Asian countries (Crescenzi and Rodríguez-Pose 2017; Yoon and Park 2017). Of the 19 Asia-Pacific city-regions in the study 5 are in the top-10 by 2007–2011 (26%) compared to 4 of 76 North American city-regions (5%) and 1 of 36 European city-regions (3%).

To some extent this result can be attributed to the way in which city-regions are identified, which is based on inventor-weighted patent data. Due to the home bias effect and other variances mentioned in the Methodology this may lead to an over-detection of city-regions in the United States (favourable home bias effect) and an under-detection of city-regions in a country such as China, where R&D expenditure per USPTO patent grant is very high (see Table 1, earlier). However when the city-region threshold is raised to a minimum R&D expenditure of $10b in 2007–2011, 5 of 10 Asian city-regions appear in the top-10 (50%) compared to 4 of 29 North American city-regions (14%) and 1 of 19 European city-regions (5%). So the observation of a high spatial concentration of R&D expenditure in a small number of city-regions within Asia Pacific holds true.
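The concentration percentages quoted above follow directly from the reported counts, as the short check below shows. The counts are taken from the text; the function and variable names are illustrative.

```python
from typing import Dict, Tuple


def top10_share(in_top10: int, total: int) -> int:
    """Percentage of a region's city-regions that rank in the global top-10."""
    return round(100 * in_top10 / total)


# (city-regions in the global top-10, all identified city-regions), 2007-2011.
all_city_regions: Dict[str, Tuple[int, int]] = {
    "Asia-Pacific": (5, 19),
    "North America": (4, 76),
    "Europe": (1, 36),
}
# Same comparison restricted to city-regions with at least $10b R&D expenditure.
large_city_regions: Dict[str, Tuple[int, int]] = {
    "Asia-Pacific": (5, 10),
    "North America": (4, 29),
    "Europe": (1, 19),
}

shares_all = {name: top10_share(*counts) for name, counts in all_city_regions.items()}
shares_large = {name: top10_share(*counts) for name, counts in large_city_regions.items()}
```

Raising the threshold removes many small North American and European city-regions from the denominator, yet the Asia-Pacific share remains several times larger than that of the other two regions.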

4.1 Asia-Pacific

Being home to city-regions with fast-growing R&D expenditure, developments taking place in the Asia-Pacific region are of particular interest. Table 3 shows the R&D expenditure of the top-10 city-regions in Asia-Pacific.

Table 3 Asia-Pacific high R&D city-regions, top-10 (R&D expenditure in 2005 constant PPP US$, world share)

Among the top-10 city-regions, all regions experience increasing R&D expenditure and an increase in their global share with the notable exceptions of Tokyo and Taipei, whose global shares are declining during the 1997–2011 period. Within the top-10 two notable groups of city-regions can be identified: city-regions that are rising in ranking position and those that are falling in ranking position, often despite increasing their world share. Rising city-regions are Beijing, Shanghai and Hong Kong. Falling city-regions, in relative terms, are Taipei, Singapore and Kaohsiung. Tokyo should also be considered a falling city-region as it is losing global share but it maintains its first position within Asia-Pacific (and globally).

Based on these developments there appears to be a shift of R&D expenditure towards the 3 Chinese city-regions, with Beijing and Shanghai being closely followed by Hong Kong. If this growth trajectory continues it is conceivable that Beijing will overtake Seoul during the next decade as the difference in R&D expenditure between them is about 22% ($37b) in 2007–2011, having fallen from 51% ($33b) during the previous 2002–2006 period. The gap between Seoul and Tokyo continues to be significant at 181% ($369b), so Tokyo may well remain the largest high R&D city-region for the foreseeable future.

While the rise of Chinese city-regions is significant, there is a notable lack of second-tier Chinese city-regions. While there may be methodological reasons for the failure to detect smaller Chinese high R&D city-regions, Korea, Japan and Taiwan all have a number of second-tier city-regions such as Kaohsiung (Taiwan), Busan, Daegu (Korea), Fukuoka and Hiroshima (Japan). While the large concentrations in Tokyo and Seoul suggest that the three largest Chinese city-regions could grow much further, a second tier of high R&D city-regions may also develop.

4.2 Europe

Viewed globally, European high R&D city-regions are in relative decline, being overtaken by large city-regions in Asia, even though in absolute terms R&D expenditure in European city-regions is increasing. Table 4 shows the R&D expenditure of the top-10 city-regions in Europe.

Table 4 European high R&D city-regions, top-10 (R&D expenditure in 2005 constant PPP US$, world share)

Within Europe a number of city-regions are increasing their relative position, notably London, Moscow, Munich and Barcelona. Declining in ranking position are Paris, The Hague, Essen and Milan. Frankfurt, which maintains its top position, is declining in terms of its world share. However of the rising city-regions only Moscow and Barcelona are increasing their share of world R&D expenditure to 0.9% and 1.6% respectively. One of the most significant downward movements is by Paris, from 2nd position to 4th and reducing its global share from 2.3 to 1.5%.

Most top-10 city-regions in Europe show only very low growth in R&D expenditure, especially when compared to other parts of the world. Within the European Union Barcelona is the only top-10 city-region with a significant rise in R&D expenditure that exceeds or is in line with the global level of growth. Therefore in relative terms, Europe’s global position in R&D is likely to continue its decline.

4.3 North America

North America is the region in which the largest number (76) of high R&D city-regions is identified, and the 10 largest regions for each period are shown in Table 5.

Table 5 North American high R&D city-regions, top-10 (R&D expenditure in 2005 constant PPP US$, world share)

As in other regions of the world, R&D expenditure in the top-10 city-regions is growing although only two city-regions, Seattle and San Francisco, are experiencing an increase in their global share. Los Angeles and Toronto maintain their global share by growing at close to the global average rate.

As noted earlier city-regions located on the Pacific coast appear to be growing at a faster rate than those positioned along the Atlantic coast (Boston, New York, Washington) and in the Midwest (Chicago, Detroit and Minneapolis). Hence the United States appears to be experiencing a domestic shift in terms of R&D towards the Pacific coast. If current trends persist it is likely that Los Angeles will overtake New York as the second-largest R&D city-region in North America during the coming decade.

5 Discussion

Identifying high R&D city-regions based on patent data provides an interesting sub-national perspective on changes in the spatial distribution of global innovation activity, and at the same time raises some important technical and theoretical questions.

From a technical perspective, the significant differences in R&D expenditure per USPTO patent grant (Table 1) are a cause for concern as they call into question the validity of identifying high R&D city-regions based on patent data. In countries with relatively high R&D expenditure per patent such as China, smaller city-regions may not be identified because of the low number of patents. On the other hand too many clusters may be identified in the United States and Japan where R&D expenditure per patent is relatively low.

A second concern is the potential underidentification of city-regions in emerging economies because city-regions are identified using 15 years of accumulated patent data from 1997 to 2011. During the earlier part of that period patenting in emerging economies may have been relatively low, thus lowering the cumulative patenting output on which the identification of city-regions is based. However during the later part of the period patenting activity was high as these city-regions experienced rapid R&D expenditure growth. Therefore identifying city-regions based only on relatively recent data may be another way to avoid underidentification in emerging economies.

Based on these technical concerns the methodology is likely to have failed to identify smaller city-regions in countries with high R&D expenditure per patent such as China ($78 m), Russia ($45 m) and Spain ($21 m). Nevertheless city-regions were successfully identified in these countries using the current methodology and these city-regions feature among the largest in the world.

From a theoretical perspective, the identification of large R&D intensive city-regions is an interesting phenomenon. Are Tokyo-Nagoya-Osaka or Frankfurt-Zurich-Grenoble or New York-Philadelphia really part of a single interconnected city-region or corridor? Or are they a set of spatially proximate but essentially separate cities? Regardless of the answer, the existence of large spatial concentrations of R&D is a notable geographical phenomenon that may be due more to urbanization in general than due to R&D activity specifically.

The existence of (very) large city-regions alongside many smaller city-regions also raises questions about the inequality of R&D activity between major cities around the world. While inequality in research output can be measured in many different ways (Jeon and Kim 2018), the decline of some large city-regions and the growth of others in terms of their R&D expenditure has implications for the global distribution of economic activity and prosperity. The ranking tables in this paper clearly show that in terms of R&D expenditure, the world is highly unequal. The tables also show that city-regions in Asia-Pacific such as Seoul, Beijing, Shanghai and Hong Kong, are showing positive path-creation as they rapidly improve their global position.

Identifying innovation city-regions using patent data also opens the door to further analysis of their development, including changes in the patent co-invention networks or Triple Helix collaboration networks (Leydesdorff and Persson 2010; Leydesdorff and Sun 2009), patent inventor-assignee networks (Bhattacharya 2004), patent quality indicators (Gautam et al. 2014; Hagedoorn and Cloodt 2003; Lanjouw and Schankerman 2004) or changes in the kinds of collaborations and research taking place within a city-region’s innovation system (Park and Leydesdorff 2010; Stek and Van Geenhuizen 2015), including the distance of a city-region from the technological frontier (Toivanen and Suominen 2015) and which important technologies are being researched (Jung 2017). These are all avenues that offer opportunities to gain further insight into the sub-national spatial dynamics of global R&D expenditure and innovation activity.

6 Conclusion

This paper offers a methodology for mapping high R&D city-regions worldwide and its results provide meaningful insight into the changing global distribution of innovation activity, which is shifting towards a number of cities in Asia and three Chinese cities in particular.

The relatively stable position of the largest high R&D city-regions suggests there is considerable path-dependence in the global research and innovation system. However the rise of certain cities in Asia suggests path-creating factors are present there, either due to public policy interventions, foreign investment in R&D, or more likely, a combination of the two (Crescenzi and Jaax 2017; Dicken 2007).

In addition to Asian cities, within Europe and North America there are a number of city-regions with R&D growth that exceeds the growth rate of the broader region in which they are located. These high-growth city-regions include San Francisco, Seattle, Barcelona, London, Moscow and Munich and their innovation development may be of interest to researchers seeking to understand the path-creating factors that influence local R&D activity. As highlighted in the discussion, patent data can also be used to derive other indicators related to networks, organizations and technology, which may provide further insight into these city-regions.

Aside from providing insight into the global geography of R&D, the research also demonstrates some of the challenges of estimating R&D expenditure using patent data. The identification of very large city-regions that include multiple cities also raises a notable theoretical question: do existing agglomeration theories apply to such multi-city ‘meta regions’? Or should the methodology be adapted? These are questions that can hopefully be addressed in future research.