National and organizational patterns of Nobel laureate careers in physiology/medicine, physics, and chemistry

This paper examines the distribution of Nobel laureates in Physiology/Medicine, Physics, and Chemistry across countries and research organizations. We provide basic information about where future laureates received their education and/or conducted their research, then present heat maps depicting country and organizational specialization patterns. In addition, we identify the organizational ultra-elite in science: universities and research institutes that show continuously above-average numbers of future laureates, typically in one career phase. Furthermore, we identify those universities and research institutes that have undergone considerable growth (or decline) in their capabilities for highly innovative research. Also, we compare country-specific profiles with those at the organizational level. Our findings are interpreted in the light of findings from comparative-historical studies.


Introduction
Following the seminal publication by Zuckerman (1977), the Nobel Prizes in Physiology/Medicine, Physics, and Chemistry have attracted considerable attention in quantitative studies of science, especially with regard to achievement age (Jones & Weinberg, 2011; Redelmeier & Naylor, 2016), the time lag between prize-winning work and awarding of the prize (Becattini et al., 2014; Fortunato, 2014), and the distribution of other science awards and collaboration networks in the years before and after their awarding (Chan et al., 2014, 2015). Several studies have examined Nobel laureates from a bibliometric point of view, including spillover effects for the citations of laureates' publications unrelated to the Nobel Prize (Mazloumian et al., 2011) and differences in citation patterns between landmark papers featuring theory, methods, experiments, or inventions (Heinze et al., 2013; Zhou et al., 2014). Cross-national comparisons have found that Nobel Prizes "can be used to validate bibliometric indicators" (Rodríguez-Navarro, 2011).
More recently, analyses of the population of Nobel laureates have focused on the rise of North America as a global center of science and technology, its subsequent hegemony (Heinze et al., 2019a, b), and how national institutional contexts have shaped the capabilities of universities and research organizations to achieve scientific breakthroughs (Heinze et al., 2020). These studies found that North America, in particular the United States, replaced Germany as global scientific center by the 1920s, that its hegemony was consolidated in the 1970s, and that although its leadership has come under pressure since the 2000s, a new global powerhouse is not in sight. Furthermore, it was shown that national contexts exerting weak institutional control are associated with organizational capabilities to achieve scientific breakthroughs. More specifically, countries with weak institutional control (United States, United Kingdom) have produced many more Nobel laureates, controlling for population size and GDP per capita, than those exerting strong control (France, Germany).
However, much less attention has been paid to the distribution of laureates across universities and research institutes (Schlagberger et al., 2016). There is no comprehensive map of the organizational field in which future laureates were educated, conducted their prizewinning research, and worked when awarded the prestigious prize. Furthermore, rankings that include Nobel laureates, such as the "Academic Ranking of World Universities" (commonly known as the Shanghai Ranking), do not consider where future Nobel laureates were educated or conducted their prize-winning research, but focus solely on information at the time when the Nobel Prize was awarded.
This paper contributes to the ongoing debate on Nobel Prizes in three ways. First, it provides basic information about the universities and research institutes where future laureates received their education and/or conducted their research. Second, we present heat maps depicting both country and organizational specialization patterns, focusing on those universities and research facilities that underwent considerable growth (or decline) in their capabilities for highly innovative research. Third, we identify the organizational ultra-elite in science: universities and research institutes that show continuously above-average numbers of future laureates, typically in one career phase, but sometimes also across two career phases.

Data and method
This paper examines the distribution of Nobel laureates in Physiology/Medicine, Physics, and Chemistry across national and organizational boundaries. We distinguish three career stages: (1) the university where future Nobel laureates received their highest academic degree (HD), (2) the university or research organization where they performed their prize-winning research (PWR), and (3) the university or research organization where they were employed at the time of the award (NP). Our analysis covers 100 years (1901-2000) of an existing dataset (Heinze et al., 2019a, b; Heinze et al., 2020) that includes the entire time period 1901-2020 (120 years). The primary data source was the Nobel Foundation's website (www.nobelprize.org), enriched by data from secondary sources, such as the American Institute of Physics, American National Biography, Encyclopedia Britannica, Howard Hughes Medical Institute, National Academy of Sciences, Notable Names Database, and Royal Society.
First, we provide basic descriptive information about both the laureate population and the top 50 universities and research organizations (Table 1). Second, we present heat maps based on calculations of the specialization index RESP (see below). This index is calculated using the Activity Index (Narin et al., 1987; Piro et al., 2017), which captures the extent to which certain entities are specialized in certain activities (Formula 1). AI values lower than 1.0 indicate a negative specialization (below-average scores), and AI values greater than 1.0 indicate a positive specialization (above-average scores). A verbal expression of the AI, applied to Nobel laureates, is given in Formula 2.
Formula 1: General formula of the Activity Index (AI)
Formula 2: Specific AI applied to career phases of Nobel laureates
The AI's value range of [0.0, +∞) lacks an upper limit. Available indexes that are symmetrical both above and below the expected value include, for example, the Revealed Symmetric Comparative Advantage (RSCA) to capture country-specific technical specialization (Laursen, 2000, 2015). Furthermore, the Relative Specialization Index (RSI) has been used to map profiles of Scandinavian universities (Piro et al., 2011, 2014, 2017). Interpreting RSCA and RSI is easier than interpreting the AI, due to their symmetrical value range of [−1.0, +1.0]: values lower than 0.0 indicate negative specialization; values greater than 0.0 indicate positive specialization. We use a modified version of the RSCA/RSI index that was introduced by Grupp (1994, 1998). Its value range is [−100.0, +100.0] with an expected value of zero (Formula 3). This index, which we call RESP (for "Index of Relative Specialization"), differs from RSCA/RSI in that it is based on the hyperbolic tangent. Consequently, its curve is steeper and reaches the limits of its value range more quickly than RSCA/RSI. Hence, RESP-based heat maps are richer in contrast and present specialization profiles more visibly. For further details, see Heinze et al. (2019a, b).
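For orientation, the three indices discussed above can be written in one common notation. This is a sketch in the standard form of the Activity Index and of Grupp's tanh-based index, and not necessarily the exact notation of Formulas 1 to 3. The symbol x_{ij} (number of career events of entity i in career phase j) is an assumption introduced here.

```latex
% Activity Index: share of phase j within entity i, divided by the
% share of phase j among all events (x_{ij} is assumed notation).
\[
\mathrm{AI}_{ij}
  = \frac{x_{ij} \,/\, \sum_{j} x_{ij}}
         {\sum_{i} x_{ij} \,/\, \sum_{i}\sum_{j} x_{ij}}
\]

% Symmetric indices, subindices omitted.
\[
\mathrm{RSCA} = \frac{\mathrm{AI} - 1}{\mathrm{AI} + 1}
              = \tanh\!\left(\tfrac{1}{2}\ln \mathrm{AI}\right),
\qquad
\mathrm{RESP} = 100\,\frac{\mathrm{AI}^{2} - 1}{\mathrm{AI}^{2} + 1}
              = 100\,\tanh\!\left(\ln \mathrm{AI}\right)
\]

% Worked value: AI = 2 gives RSCA = 1/3 and RESP = 60.
```

Written this way, both indices apply the hyperbolic tangent to ln AI, but RESP does so without the factor 1/2, which is one way to see why its curve is steeper around the expected value.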
Formula 3: Relative Specialization (RESP)
Note: The subindices i and j of the AI are omitted for the sake of simplicity. As mentioned, the RESP has two distinct ranges: values above 0 up to 100 indicate above-average performance, and values below 0 down to −100 indicate below-average performance. Graphically, the three anchor points −100, 0, and 100 can be indicated using three colors. We created heat maps to display RESP values, with blue fading to yellow for values between −100 and 0, and yellow fading to orange for values between 0 and 100. Thus, in the heat maps, shades of blue stand for below-average performance, shades of yellow stand for (almost) average performance, and shades of orange stand for above-average performance.
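The calculation and the color coding described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the tanh-based form RESP = 100 · tanh(ln AI). The function names, the RGB values chosen for blue, yellow, and orange, and the toy counts are all hypothetical and are not taken from the dataset or from the R/ggplot2 code used for the figures.

```python
import math


def activity_index(counts, row, col):
    """AI for entity `row` and career phase `col`.

    counts maps entity -> {phase: number of career events}.
    """
    cell = counts[row][col]
    row_total = sum(counts[row].values())
    col_total = sum(c[col] for c in counts.values())
    grand_total = sum(sum(c.values()) for c in counts.values())
    # Share of the phase within the entity, divided by the share of the
    # phase across all entities.
    return (cell / row_total) / (col_total / grand_total)


def resp(ai):
    """RESP = 100 * tanh(ln AI), written without the logarithm so AI = 0 works."""
    return 100.0 * (ai * ai - 1.0) / (ai * ai + 1.0)


# Assumed RGB anchors for the three colors named in the text.
BLUE, YELLOW, ORANGE = (0, 0, 255), (255, 255, 0), (255, 165, 0)


def resp_color(value):
    """Linear blend: blue at -100, yellow at 0, orange at +100."""
    if value < 0:
        start, end, t = BLUE, YELLOW, (value + 100.0) / 100.0
    else:
        start, end, t = YELLOW, ORANGE, value / 100.0
    return tuple(round(a + t * (b - a)) for a, b in zip(start, end))


# The closed form agrees with the tanh definition.
assert math.isclose(resp(2.0), 100 * math.tanh(math.log(2.0)))

# Hypothetical counts for one period (not from the paper's dataset).
toy = {
    "A": {"HD": 2, "PWR": 6, "NP": 8},
    "B": {"HD": 8, "PWR": 4, "NP": 2},
}
for entity in toy:
    for phase in ("HD", "PWR", "NP"):
        value = resp(activity_index(toy, entity, phase))
        print(entity, phase, round(value, 1), resp_color(value))
```

In the toy data, entity A is underrepresented in HD and overrepresented in NP, so its HD cell comes out bluish and its NP cell orange-tinted, while entity B shows the reverse.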
All figures were produced in R (4.1.0; R Core Team, 2021) with the R package ggplot2 (3.3.4; Wickham, 2016).

Results
Our dataset covers the entire time period 1901-2020 (120 years). It includes n = 360 Nobel Prizes awarded to n = 624 Nobel laureates, of which n = 186 laureates are in Chemistry, n = 216 laureates are in Physics, and n = 222 laureates are in Medicine/Physiology. In principle, n = 624 laureates would generate 1,872 career events in HD, PWR, and NP (624 × 3 = 1,872). However, we consider career events in the time frame 1901-2000 only, as explained below. This means that some career events, particularly those for the early twentieth-century Nobel Prizes that reach back into the nineteenth century, are not considered in our analysis. In addition, we do not consider NP events after 2000. Therefore, the number of career events is reduced at the "left" and "right" ends of our dataset. In total, we consider 1,578 career events (HD, PWR, NP) and 341 organizations from 100 years (1901-2000), the latter including universities, public research institutes, and private research laboratories (Table 1). Of these, 465 career events refer to laureates in Chemistry, 546 to laureates in Physics, and 567 to laureates in Physiology/Medicine. PWR was the most frequent career step, with around 20 events more than HD and nearly 100 events more than NP. The different numbers of events result particularly from laureates winning the Nobel Prize after 2000. In this case, we consider them as persons (as we consider all laureates from 1901 to 2020), but not their NP event (as we consider only events from 1901 to 2000). Therefore, column "NP" in Table 1 is identical to the number of laureates for the period 1901-2000, whereas both HD and PWR refer to the number of laureates for the period 1901-2020. Consider also that some HD and PWR events occurred before 1901: these events are excluded from our analysis. That column "HD" in Table 1 has fewer events than column "PWR" in both Chemistry and Physiology/Medicine indicates that it was not possible to retrieve HD-related information for all laureates.
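The totals quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated in the text; the variable names are illustrative.

```python
# Figures quoted in the text.
laureates = {"Chemistry": 186, "Physics": 216, "Physiology/Medicine": 222}
events = {"Chemistry": 465, "Physics": 546, "Physiology/Medicine": 567}
PHASES = 3  # HD, PWR, NP

total_laureates = sum(laureates.values())
max_events = total_laureates * PHASES   # upper bound if every event were counted
considered = sum(events.values())       # events within 1901-2000

print(total_laureates, max_events, considered, max_events - considered)
```

The difference between the upper bound and the number of events considered corresponds to the events cut off at the two ends of the observation window or not retrievable.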
The different distribution of career events is mapped in Fig. 1. The strongly decreasing number of HD events towards the end of the timeline arises because the PWR and HD events in the period 1971-2020 will be fully known only when future NP laureates have entered the dataset (starting in 2021). In other words, the dataset is complete with regard to the NP events for the entire time period (1901-2020), but it is incomplete for PWR and HD. The latter two will be completed with information about recipients of the Nobel Prize starting in 2021. Therefore, we limit our analysis to the period with enough HD events (1901-2000). As the RESP is a share of shares, this fluctuation in the group size of each category plays only a minor role in calculating the RESP and, more importantly, in interpreting it. It is important to note that RESP values are calculated separately for each time period.
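The "share of shares" property can be illustrated with a small sketch: if all counts in a period are scaled by the same factor, the RESP values do not change. The code assumes the tanh-based form RESP = 100 · tanh(ln AI). The function name and the counts are hypothetical, and the example covers only the idealized case of proportional scaling, not the sampling noise that comes with small counts.

```python
def resp_table(counts):
    """RESP per (entity, phase) for one period; counts[entity][phase] = events."""
    phases = sorted({p for row in counts.values() for p in row})
    grand = sum(sum(row.values()) for row in counts.values())
    col = {p: sum(row.get(p, 0) for row in counts.values()) for p in phases}
    out = {}
    for entity, row in counts.items():
        row_total = sum(row.values())
        for p in phases:
            # Activity Index: within-entity share over overall share.
            ai = (row.get(p, 0) / row_total) / (col[p] / grand)
            out[entity, p] = 100.0 * (ai * ai - 1.0) / (ai * ai + 1.0)
    return out


# Hypothetical period with few events, and the same profile with five
# times as many events.
small = {"X": {"HD": 1, "PWR": 3, "NP": 2}, "Y": {"HD": 4, "PWR": 2, "NP": 3}}
large = {e: {p: 5 * n for p, n in row.items()} for e, row in small.items()}

a, b = resp_table(small), resp_table(large)
for key in a:
    print(key, round(a[key], 1), round(b[key], 1))
```

Because each period is evaluated on its own totals, a period with many events and a period with few events are placed on the same scale.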
The four countries with the most career events (1,229 or 78%) and most organizational entities (232 or 68%) are (in descending order) the United States (729 career events and 120 organizations), United Kingdom (243 career events and 40 organizations), Germany (178 career events and 51 organizations), and France (79 career events and 21 organizations). Table 2 displays all the countries. We calculated RESP values based on all countries in the database (n = 30), using 20-year periods. Figure 2 displays the results for the four countries with the most career events and organizational entities.
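The aggregate shares follow directly from the country figures quoted above, as this short check shows. It uses only numbers stated in the text; the variable names are illustrative.

```python
# (career events, organizations) per country, as quoted in the text.
countries = {
    "United States": (729, 120),
    "United Kingdom": (243, 40),
    "Germany": (178, 51),
    "France": (79, 21),
}
TOTAL_EVENTS, TOTAL_ORGS = 1578, 341

top4_events = sum(e for e, _ in countries.values())
top4_orgs = sum(o for _, o in countries.values())

print(top4_events, round(100 * top4_events / TOTAL_EVENTS))
print(top4_orgs, round(100 * top4_orgs / TOTAL_ORGS))
```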
Three results are noteworthy. First, the United States shows a decreasing specialization in educating future laureates (HD): compared with other countries, the United States increasingly relies on foreign-born and foreign-educated scientists. This result corroborates findings from Stephan and Levin (2001). At the same time, the United States has become more specialized in the later career phases (PWR, NP), indicating its growing attractiveness over the twentieth century as a work environment for future laureates (Heinze et al., 2020). These developments are especially pronounced in the medical sciences. Remember that the RESP is a relative index. A decreasing specialization in educating future laureates (HD) does not rule out stagnating or even rising numbers of future laureates educated. It means only that the share of HD career events among all career events in the United States has declined relative to the corresponding share across all countries.
Second, Germany has almost the opposite specialization to that of the United States: it shows an increasing specialization in the education of future laureates (HD), whereas its attractiveness as a work environment for later career phases decreased in the second half of the twentieth century, especially in Medicine/Physiology. A similar but somewhat less pronounced pattern can be observed for France. Here, specialization in the education of future laureates (HD) has increased over time in Chemistry, whereas a decreasing pattern can be observed for the third career phase (NP), which is typically characterized by below-average scores in the three disciplines starting in the 1940s.
Third, the United Kingdom shows stability in the two later career phases: its specialization in PWR and NP is visible for the entire twentieth century, especially in Medicine/Physiology, where it matches the pattern of the United States. Similarly, the United Kingdom resembles the United States with its decreasing emphasis on educating future laureates (HD), particularly in Chemistry and Physics.
The first and second results are in line with comparative-historical evidence that highlights the declining hegemony of German universities in the early twentieth century, coupled with an upswing of research universities in the United States (Ben-David, 1960, 1971). Ben-David explains this development with regard to both internal organizational features of North American universities that were more conducive to the growth of new research fields (compared with those in Germany), and the more pronounced level of decentralized competition in the American university system, particularly between public and private universities (compared with exclusively public higher education in Germany). In addition, the Holocaust and the emigration of Jewish scientists from Nazi Germany to the United States accelerated the shift of the global center of science to North America (Moser et al., 2014; Waldinger, 2010). Also, the first and third results are in line with comparative-historical evidence suggesting that national contexts in the United States and the United Kingdom exerted weak institutional control on universities and research organizations, and thus facilitated highly innovative research capabilities in the twentieth century (Hollingsworth, 2004, 2006), a finding that is also reflected in data on institutional context in the twenty-first century (Pruvot & Estermann, 2017). In sum, the heat maps of the United States and Germany reflect the migration of scientists (and laureates) after their highest degree (HD) to the disadvantage of Germany, particularly in the time period 1941-1960.
In this context, the question arises how the shares of all three categories (NP, PWR, HD) develop across countries. Figure 3 addresses this question. It shows that in the first time period (1901-1920) no country was dominant. The four countries in our peer group (US, UK, FR, and DE) each have a share of 10 to 30% across all three disciplines. Over time, however, the US has come to dominate: since the period 1941-1960, every second observation in our dataset is related to the US, so the US increasingly determines the average shares of HD, PWR, and NP across all observations. This, in turn, has a massive impact on the RESP, because RESP values are measured against the overall average. In the early periods, RESP values are measured against an average formed by all countries together; in later periods this average is dominated by the US, so RESP values are effectively measured against the US profile. This is clearly visible in Fig. 2 for the US, whose cells are more or less yellow in the last time period for all disciplines. Yellow means that RESP values are near the average share of the observations. If the RESP is measured against the average of all observations, and if the US dominates this average through its share of all observations, then the US values of the RESP should be near the average.
We turn now to the organizational level. Given the results above, it is certainly not astonishing that the Top-10 universities are from the United States (8) and the United Kingdom (2). Equally important, however, is the considerable variation among the Top-50 with regard to their representation in the three career phases (Table 3). Therefore, we probed organizational specializations in Nobel laureates' careers.
For this purpose, we calculated RESP values for all organizations in the database (n = 341), using 20-year periods. Figures 4 and 5 display results for the Top-20. We also checked robustness by calculating RESP values for those organizations with more than 2 career events (n = 180) and for those with more than 10 career events (n = 37). Overall, specialization patterns were very robust. Therefore, we focus here on results for all organizations in the database. In our view, the following results are noteworthy.
First, there is some stability in single career phases over time, most notably in the education of future Nobel laureates (HD). Here, in at least four (out of five) consecutive periods, the following universities show a constant positive specialization over the twentieth century across all three disciplines (first column, Figs. 4 and 5): Cambridge, Harvard, Columbia, Caltech, MIT, and Göttingen. Among those with a stable positive specialization in later career phases in at least four (out of five) consecutive periods are Bell Labs (PWR), Rockefeller (NP), Caltech (NP), and London (NP). Clearly, constant positive specialization in PWR and/or NP requires considerable resources to build and maintain capabilities for highly innovative research. Borrowing a term coined by Zuckerman (1977), it is fair to call those universities and research institutes with constant positive specializations in any of the three career phases the organizational ultra-elite in global science. To be sure, this ultra-elite constitutes a very thin layer. Note that the Rockefeller Institute (later the Rockefeller University) stands out as the single entity with the most above-average scores in 20-year periods in PWR and NP combined, highlighting its particular status among the organizational ultra-elite (for historical details on the Rockefeller Institute, see Hollingsworth, 2004).
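The stability criterion used above (positive specialization in at least four out of five consecutive periods) can be expressed as a simple run-length check. The following is a minimal sketch; the function names and the two RESP series are hypothetical and do not reproduce values from Figs. 4 and 5.

```python
def longest_positive_run(values):
    """Length of the longest run of consecutive values greater than zero."""
    best = run = 0
    for v in values:
        run = run + 1 if v > 0 else 0
        best = max(best, run)
    return best


def is_stable_positive(values, min_run=4):
    """True if RESP is positive in at least `min_run` consecutive periods."""
    return longest_positive_run(values) >= min_run


# Hypothetical RESP series for the five 20-year periods 1901-2000.
series_a = [35, 60, 12, 48, -20]   # positive in the first four periods
series_b = [35, -5, 12, 48, 70]    # interrupted in the second period

print(is_stable_positive(series_a))
print(is_stable_positive(series_b))
```

Treating a value of exactly zero as not positive is a choice made for this sketch; a threshold slightly above zero could be used instead if near-average cells should not count.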
Second, there is no single university or research organization in either Fig. 4 or Fig. 5 with positive specializations in all three career phases in one (or more) 20-year period(s). More specifically, every university has at least one 20-year period during which no laureate earned their highest academic degree (HD), performed prize-winning research (PWR), or was employed there at the time of the award (NP). This result is expected. For example, if the share of NP among all observations at an institution is above average, then at least one of the other two shares (HD, PWR) must be below average, since the three shares always sum to 100% of the observations. However, we find examples that come close to the case where all three shares are near their average shares (first column, Figs. 4 and 5): MIT (1961-2000), Cambridge and Munich (1961-80), Columbia and Oxford (1941-60), and HU Berlin and Göttingen (1901-20). Note that above-average scores in the PWR and NP career phases indicate extremely high capability for innovative research. In addition, consider that some universities underwent a considerable change in their respective capabilities. For example, compare the first (1901-1920) and fifth (1981-2000) periods for two of the above-mentioned universities: HU Berlin (decrease) and MIT (increase). Also, most often several decades lie between HD and NP, so both the HD and the PWR observations are concentrated before 1980.
Third, it is noteworthy that some organizational patterns are shaped by one or sometimes two disciplines (Figs. 4 and 5). Consider for example Princeton University, whose overall pattern is strongly shaped by Physics, with neither Chemistry nor Medicine/Physiology appearing to play an important role. In contrast, the Rockefeller University's overall pattern is clearly shaped by Chemistry and Medicine/Physiology.
Fourth, few universities have changed their profile in a given career phase in one particular direction (Figs. 4 and 5). Among those with growing specialization (over at least four consecutive periods) are Cambridge (PWR, Chemistry, 1921-2000) and Columbia (PWR, all disciplines, 1921-2000); conversely, among those with decreasing specialization are Princeton (HD, Physics, 1901-2000) and Göttingen (PWR, Physics, 1901-1980). This suggests that at Cambridge and Columbia, some intra-organizational process of building up capabilities to conduct highly innovative research took place, whereas at Princeton and Göttingen we assume either that some process of downscaling such capabilities occurred, or that the share of NP was scaled up, automatically leading to a downscaling of HD and PWR (remember that all three categories sum to 100% at one institution). How and why such processes unfolded is beyond the scope of this paper but could be examined from a historical perspective.
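The sum constraint invoked in the results above can be written out explicitly. The derivation below is a sketch in assumed notation: x_{ij} is the number of career events of organization i in career phase j, s_{ij} is the share of phase j within organization i, and \bar{s}_j is the share of phase j among all events.

```latex
\[
s_{ij} = \frac{x_{ij}}{\sum_{j} x_{ij}}, \qquad
\bar{s}_{j} = \frac{\sum_{i} x_{ij}}{\sum_{i}\sum_{j} x_{ij}}, \qquad
\mathrm{AI}_{ij} = \frac{s_{ij}}{\bar{s}_{j}}
\]

% Both sets of shares sum to one over the three phases j = HD, PWR, NP:
\[
\sum_{j} s_{ij} = \sum_{j} \bar{s}_{j} = 1
\;\;\Longrightarrow\;\;
\sum_{j} \left( s_{ij} - \bar{s}_{j} \right)
  = \sum_{j} \bar{s}_{j} \left( \mathrm{AI}_{ij} - 1 \right) = 0
\]
```

Since the weights \bar{s}_j are positive, the deviations AI_{ij} − 1 cannot all have the same sign: an organization with AI above 1 (positive RESP) in one phase must have AI below 1 (negative RESP) in at least one other phase within the same period. A rising RESP in NP can therefore lower the RESP in HD or PWR even when the absolute numbers of HD and PWR events stay constant.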
In the light of the above-mentioned country-specific patterns, we probed whether they are reflected at the organizational level. Our analysis shows that this is not the case. First, there is no single university or research organization that roughly matches all three national specializations over time. Rather, we find some examples where specializations in one career phase (and sometimes two) are similar. Three examples illustrate this point: (1) Cambridge mirrors the UK pattern in PWR (stable positive), (2) Princeton reflects the US pattern both in HD (decreasing) and NP (increasing), and (3) Göttingen develops a profile similar to that of Germany both in HD (increasing) and in NP (decreasing). Second, there are several universities that show patterns quite different from their national one. Two examples follow: (1) Cambridge is less specialized in the third career phase (NP) than the United Kingdom in general. Although it has educated an above-average number of future Nobel laureates and provided them with attractive working conditions, Cambridge retains them less often than does the UK as a whole. (2) Similarly, although Columbia follows the (increasing) specialization of the United States in the first two career phases (HD, PWR), it has a weaker profile in NP compared with the national value.
... that show continuously above-average contributions in the education and employment of future laureates. In comparison with commonly used rankings, such as the Shanghai Ranking or the Leiden Ranking, that provide information about top-performing universities in the early twenty-first century (for details see Moed, 2017), our analysis covers the entire twentieth century, and thus gives insights into the building and maintenance of capabilities for highly innovative research.
Our dataset can be analyzed further. Besides the RESP values for countries and for particular institutions, their interaction can be explored: that is, how do RESP values change when organizations are benchmarked world-wide versus country-wide? Also, for most ultra-elite organizations, data on financial resources and scientific staff are available. In addition, for universities that shift their emphasis to educating future laureates (high HD RESP values), such shifts could be examined with regard to whether they are reflected in absolute values or are just a result of the rising dominance of the US and/or rising PWR and NP values at other universities. Linking RESP values to both financial and human resources could reveal more information about ultra-elite research organizations. Finally, we used descriptive analyses for presenting the RESP values here. Of course, RESP values can be analyzed statistically, too.
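One way such a comparison of world-wide versus country-wide benchmarks could be set up is sketched below. It assumes the tanh-based form RESP = 100 · tanh(ln AI). The function names, the organizations, their country labels, and all counts are hypothetical and serve only to show how the reference set changes the result.

```python
def resp(cell, row_total, col_total, grand_total):
    """RESP from raw counts: 100 * tanh(ln AI) in closed form."""
    ai = (cell / row_total) / (col_total / grand_total)
    return 100.0 * (ai * ai - 1.0) / (ai * ai + 1.0)


def resp_for(org, phase, orgs, reference):
    """RESP of `org` in `phase`, benchmarked against the names in `reference`."""
    pool = [orgs[name] for name in reference]
    col_total = sum(o["events"][phase] for o in pool)
    grand_total = sum(sum(o["events"].values()) for o in pool)
    row = orgs[org]["events"]
    return resp(row[phase], sum(row.values()), col_total, grand_total)


# Hypothetical organizations and counts, invented for illustration only.
orgs = {
    "Org A": {"country": "US", "events": {"HD": 6, "PWR": 8, "NP": 10}},
    "Org B": {"country": "US", "events": {"HD": 3, "PWR": 9, "NP": 12}},
    "Org C": {"country": "DE", "events": {"HD": 9, "PWR": 5, "NP": 2}},
}
world = list(orgs)
national = [n for n, o in orgs.items() if o["country"] == orgs["Org A"]["country"]]

for phase in ("HD", "PWR", "NP"):
    print(phase,
          round(resp_for("Org A", phase, orgs, world), 1),
          round(resp_for("Org A", phase, orgs, national), 1))
```

In this toy example, the first organization is below the world-wide average in HD but above the average of its own country, which illustrates how the choice of reference set can change the sign of a RESP value.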