1 Introduction

Understanding whether industrial invention depends on science remains an important topic for research and public policy, and more specifically, understanding how science or the interactions between science and technology may impact the rate and direction of emerging technologies. Nightingale (1998) distinguishes between research as science focused upon the creation and validation of generalizable knowledge whereas practice as technology focuses upon solving specific practical problems. Earlier studies addressing how to categorize the technological impact of new knowledge make a key distinction between the degree of novelty involved and the degree of impact (Trajtenberg 1990; Kaplan and Vakili 2015; Wang et al. 2017). Science is represented by the degree of novelty while technology presents the economic impact and usefulness of an idea. With a focus on recombinations of technological components and their interdependencies, the literature analyzing patents and patent citations has studied how different search strategies, as contrasted with the landscape of all possible searches, can affect the type of industrial invention which later occurs (Fleming and Sorensen 2001). Fleming and Sorensen (2004) propose that whereas technology search occurs through incremental steps with independent components, science provides advantages for distant search (for more breakthrough inventions), especially where there are highly coupled components. In contrast, Kaplan and Vakili (2015) argue that breakthrough inventions may require both narrow recombinations of application areas as well as more distant search. McKelvey and Saemundsson (2021) propose that generating both new scientific and technical knowledge can be conceptualized as an evolutionary problem-solving process, where there are often ambiguities—or gray zones—between research and actual use in practice. 
Hence, in addition to the literature considering search in relation to the rate and direction of technological change, we also need to consider the time required for new ideas to be implemented in society. Extensive studies of science and technology demonstrate a time-lag between the appearance of a novel idea in science and the socioeconomic impact of that new knowledge through technology or wider use in society (Salter and Martin 2001). We therefore focus upon the combination of time-lag through delayed recognition with impact. More generally, the concept of sleeping beauties in science is attributed to Van Raan (2004), to highlight both delayed recognition (sleepers) and high impact (beauties), which we here apply to patents. We hence apply and develop this concept with regard to technology, in order to better understand potential breakthrough inventions in an emerging technology. Our aim is to study whether innovation depends on long-term patterns of interactions in technology and science, using patents in nanotechnology. Specifically, we are interested in what kinds of links to science matter and what influences the delayed recognition and high impact of a technological invention, for this emerging technology.

Knowledge needs to be continually recombined, in order to be useful for new scientific outcomes and for applying technology to solve problems. Since there is no direct literature on the factors affecting the probability of a patent becoming a sleeping beauty in nanotechnology, we combined various literature streams in our deductive approach to build our hypotheses below. To do so, we adapted some concepts from scientometrics used for studying scientific fields and communities in order to study technology (as well as linkages between science and technology). Parallelisms are common between scientometrics and patentometrics (Narin 1994; Meyer 2000), and we followed this tradition.

Being able to recombine different types of knowledge in ways that stimulate emerging technologies matters for public policy. Although we do not directly address the topic of the antecedents and consequences of technological specialization within regions, we are aware of the vast number of studies within regional science which consider technological development (Henning and McKelvey 2020; Neffke et al. 2011; Bathelt et al. 2004; Glanzel and Garfield 2004; Hicks et al. 2001; Beise and Stahl 1999). The concept of related variety captures the details of how local knowledge bases in relation to technological and industrial specialization affect the long-term development of regions (Neffke et al. 2011; Juhász et al. 2021). Asheim and Coenen (2005) conceptualized different regional knowledge bases as being predominantly based on either analytical (scientific) or synthetic (industrial) knowledge. Bathelt and Glückler (2003) and Bathelt et al. (2004) represent the relational turn in economic geography, in order to understand how knowledge, geography, and networks interrelate in explaining knowledge creation and diffusion. For understanding technology-specific attributes of R&D collaboration networks, Neuländtner and Scherngell (2020) focus on the geographical and relational effects of networks in Key Enabling Technologies (KETs), and find that for all technologies, varying degrees of network effects compensate for some geographical boundaries; nanotechnology specifically has more localized but some inter-regional links. We hope our study helps to inform a limited part of this debate, through the policy implications for science-technology linkages and by following the previous literature which considers nanotechnology as an important emerging technology.

To study long-term patterns, we developed the metaphor of sleeping beauties for patents. We do so by combining three strands of literature, namely patent citations as indicators of innovation; the concept of sleeping beauties in scientometrics; and specific studies of nanotechnology as an emerging technology. The metaphor has previously been applied to understanding science through scientific publications (van Raan 2004; Dey et al. 2017), delayed recognition of application-oriented sleeping beauties (van Raan 2015), and references to those scientific papers in patents (van Raan 2015, 2017). In studies of science, “sleeping beauties” refer to scientific papers that “sleep” (receive no citations) until they are “woken up” by a citing paper, and then receive many citations, indicating a large impact on subsequent science. Very few studies have applied this to patents. Hou and Yang (2019) applied the concept to graphene patents in China, in order to identify different patterns of being “awoken.” We propose that the metaphor of sleeping beauties is useful in patents to disentangle the differing concepts of delayed recognition (sleepers) and high impact (beauties), which underlie the combination of the two as emerging technologies that are potential breakthrough inventions.

Taking patents as indicative of technology, we identify patents which have both delayed recognition and high impact. We do so in the empirical area of nanotechnology, because it has been identified as an emerging technology with extensive science-technology interactions (Meyer 2001; Meyer et al. 2010; Bourelos et al. 2017) and identified by the European Commission (2012) as a KET, or Key Enabling Technology. Our data are extracted from the European Patent Office (EPO) dataset PATSTAT for the years 1956–2018. We compare the population of nanotechnology patents with at least one citation, with the population of all patents (i.e., the whole population) for these years, in relation to our concept of sleeping beauties in patents. Based on our literature review, we develop two hypotheses about two different types of linkages to science, to nuance the discussions. We expect that both direct and strong science-based linkages as well as indirect and more diverse science-based linkages will positively affect sleeping beauties in nanotechnology. Based upon our analysis, we reject both hypotheses, because we find that each of these types of linkages positively affects high impact but negatively affects delayed recognition. Contrary to expectations, we find that the science-based patents mainly have an earlier impact and possibly a more direct impact on industrial invention. Control variables of IPC application class and company ownership do matter, which suggests that companies are active in combining multiple channels and sources of knowledge into industrial inventions. One contribution is to propose that non-patent literature should not be considered a proxy for science linkages in general, but instead this reflects a search amongst various types of codified as well as informal technological and scientific knowledge.
We propose that references to the non-patent literature can be conceptualized as a variety of informal and temporary ways of searching across a broader range of scientific fields and industrial application areas. Further research on these topics can elucidate a more nuanced understanding of how and why combining and recombining both scientific and technological knowledge may impact the combination of delayed recognition and high impact.

2 Literature review and hypotheses

2.1 Nanotechnology patents as indicative of emerging technologies

There is a vast amount of literature on science-technology interactions, and patents are often used as indicators of technology, as an output measurement of industrial invention, and to identify linkages between science and technology through both forward and backward citations.

This literature informs our understanding of the potential linkages between science and technology in emerging technologies. Patents that reflect inventions resulting from basic research will have broader applications than inventions stemming from corporate research (Trajtenberg et al. 1997). Veugelers and Wang (2019) find that novel science is relevant for much technology, and that high scientific risk may lead to high gains compared to more incremental steps for technology. Other studies have verified a positive relationship between scientific links and patent value (Carpenter et al. 1981; Reitzig 2003; Nagaoka 2007). Forward and backward patent citations have been extensively used as indicators of different types of value, in terms of technological impact and market value (Harhoff et al. 2003), as well as novelty seen as scientific and technological value (Albert et al. 1991). Patent citations are associated with higher value of innovations within companies (Trajtenberg 1990) and stock market value (Jaffe and Trajtenberg 1996).

Although patent citations have been criticized as a “noisy” indicator (Jaffe et al. 1998), the extant literature uses patents and citations to study many questions, such as the relative degree of basic or applied knowledge, and the degree of spillovers and technology transfer (Lissoni and Montobbio 2015); as an indicator of knowledge spillover between university research and innovation (Jaffe 1989; Acs et al. 1991); and as a proxy for geographical knowledge flows and spillovers (Glanzel and Garfield 2004; Hicks et al. 2001; Beise and Stahl 1999). When comparing citations in scientific publications to citations in patents, we need to bear in mind that numerous backward citations in patents are inserted by examiners (Azagra-Caro and Tur 2018), but the majority of citations inserted by examiners depends on the specific patent office, e.g., EPO versus USPTO (Alcacer and Gittelman 2006; Criscuolo and Verspagen 2008). An interesting pattern about patent citations is that self-citations and “unpredictable” citations have a stronger effect on patent value (Jaffe and Trajtenberg 1996), suggesting it would be valuable to study long-term patterns of citations. This provides a motivation for our study, in that we disentangle the degrees of novelty (science) and technological impact (technology).

Hence, in order to assess delayed recognition, we considered the time-lags derived from our analysis. To assess high impact, which we consider the high technological relevance of a patent, we used forward citations to the patent. Our approach follows a long line of studies verifying that the technological value of a patent is correlated with the number of forward citations the patent receives (Trajtenberg 1990; Harhoff et al. 2003; Hall et al. 2005; Sterzi 2013; Albino et al. 2014), even though this does not lead to high economic value in every case (Azagra-Caro et al. 2017). The reason for assuming such a relationship is that patents have to include all prior art in their references, in both the patents and non-patent literature. A patent cited in many subsequent others is thus considered to have a high technological relevance, since it is thus related to many later industrial inventions. Hence, we used forward citation to represent recognition of the novel technological idea for subsequent technology and industrial inventions. We differentiated between references to patents and to the non-patent literature (Callaert et al. 2006).

Moreover, using this type of analysis centered around patents, we chose to study nanotechnology as the context of an emerging technology. Nanotechnology is an emerging, interdisciplinary field which is postulated to lead to breakthrough innovations across disciplines by serving as a general-purpose technology in many areas of industry and application (Youtie 2008b; Li et al. 2007a). Nanotechnology is closely related to basic and applied research, similar to earlier fields of biotechnology and genetic engineering in terms of strong linkages between universities and firms (Genet et al. 2012; Mangematin and Walsh 2012; McKelvey 1996; Meyer 2000, 2006; Meyer-Krahmer and Smoch 1998; Rothaermel and Thursby 2007; Zucker et al. 2007; Loveridge et al. 2008; Bainbridge 2007; Schmidt 2008; Porter and Youtie 2009; Rafols and Meyer 2007; Schummer 2004). Moreover, there has been an exponential growth in nanotechnology patents over the last three decades, seen both in USPTO and EPO data (Chen et al. 2008; Li et al. 2007b; Youtie et al. 2008a). Nanotechnology has been characterized by intensive and growing patenting activity (Dang et al. 2010; Shapira and Wang 2009; Thursby and Thursby 2011; Guan and Ma 2007; Alencar et al. 2007). Thus, even though an emerging technology is by definition a small niche, the expansion outlined above suggests this context may be relevant to many actors.

During the years studied, nanotechnology is global and interdisciplinary. The global diffusion and changing patterns of dominance make this technology relevant for technological impact, where the USA was the initial champion of nanotechnology patents, followed by the EU (Germany and France especially) then Japan, but over time especially South Korea and China have had a high increase in nanotechnology patents (Chen et al. 2008; Huang et al. 2011; Leydesdorff 2008; Li et al. 2007a; Hullmann and Meyer 2003; Youtie et al. 2008a). The literature suggests that nanotechnology is an interdisciplinary field positioned between physics and chemistry, with strong elements from other scientific fields such as life sciences and material science (Leydesdorff 2008; Leydesdorff and Zhou 2007; Hullmann and Meyer 2003; Huang et al. 2011). Previous research also suggests that the prominent scientists in nanotechnology also have a higher probability of becoming prominent inventors with many, well cited patents (Bourelos et al. 2017). These characteristics mean that nanotechnology may be relevant for many industrial applications and breakthrough inventions, with many underlying scientific fields and technological components.

Although nanotechnology is clearly a field with strong linkages between science and technology, the current literature examines a vast array of more direct and indirect linkages which make it interesting to open up this debate further. Science and technology linkages are more prevalent in nanotechnology than in other scientific fields (according to Finardi 2011; Wang and Guan 2010; Meyer 2000, 2001, 2006; and Meyer et al. 2010). Multiple actors are involved. Universities and firms have been highly involved in patenting activity in nanotechnology (Ozcan and Islam 2014; Li et al. 2007b; Mowery 2011; Fiedler and Welpe 2010; Barirani et al. 2017; Pilkington and Meredith 2009). Established companies such as IBM, Kodak, and L’Oreal, as well as entrepreneurial firms, have worked with nanotechnology innovations since the early stages (Mangematin and Walsh 2012; Rothaermel and Thursby 2007; Chen et al. 2008; Li et al. 2007a, b). Some literature suggests that knowledge inflows between universities and industry have increased over time (Jung and Lee 2014) as well as technology transfers (Bajwa et al. 2013; Bhattacharya et al. 2012; Chang et al. 2010; Gorjiara and Baldock 2014; Milanez et al. 2014; Tang and Shapira 2011).

Therefore, the specific empirical context of nanotechnology leads us to expect science-technology linkages, but there is a gap that we address, namely disentangling different types of science-technology interactions, and particularly their importance in relation to delayed recognition and high impact.

2.2 Why science may affect delayed recognition and high impact of technology

Firstly, we suggest that having a direct and strong science-base will more likely lead to future, potential breakthrough inventions in nanotechnology. In line with Fleming and Sorensen (2004), we propose that direct linkages to science should be a more extreme case of their argument that science helps guide radical search, especially with interdependent technologies.

Based on the existing literature, we suggest that linkages through a direct and strong science-base will more likely lead to delayed recognition and high impact of patents. The total amount of forward citations received may differ regarding patent ownership, and the literature shows that university-owned patents are more important for impact than firm-owned in the USA (Henderson et al. 1998; Sampat et al. 2003; Bacchiocchi and Montobbio 2009) and in Europe (Crespi et al. 2010). Similarly, having at least one academic scientist among the inventors generally results in more citations (Czarnitzki et al. 2012), and also compared to corporate-only patents (Ljungberg and McKelvey 2012). Thus, the type of patent ownership can influence its value and is correlated both with the amount of forward citations received and with the pattern of forward citations (Sapsalis et al. 2006). The argument, verified by empirical studies, is that firm-owned patents have higher short-term value, while university-owned patents have higher long-term value, depending on when cited (Czarnitzki et al. 2012; Sterzi 2013). Short-term value means that the patents are cited soon after the publication year, while long-term value means the patents are cited some time after the publication year. Coronado et al. (2017) argue that regional economic specialization significantly affects university-owned patents in R&D intensive sectors, but not in low and medium tech sectors. Furthermore, Ljungberg and McKelvey (2012), in a study of Sweden-based firms, show evidence that firms’ patents which include university scientists are on average more likely to cite the non-patent literature than non-academic ones, suggesting firms that work directly with academics are closer to the science base.

We interpret this literature as stressing the importance of having direct and strong science-base linkages to the technology for signaling technological impact, which in turn may lead to potential future breakthrough inventions in nanotechnology. Our proxy is university ownership (rather than firm ownership) of the patents, because these patents are presumably based on research results at the university, ergo directly the result of science. Our reasoning is that making recombinations of technology that depend on highly coupled, interdependent components may require a direct link to science in order to understand potential search directions. Thus these inventions may take longer, because they are further away from the current localized search.

H1

Technologies originating from a direct and strong science-base have a high probability of delayed recognition and high impact, i.e., of being sleeping beauties.

Secondly, we propose that an indirect and more diverse link to the science-base will more likely lead to future, potential breakthrough inventions in nanotechnology. Following Kaplan and Vakili (2015), we propose that having many but more diverse linkages could facilitate breakthrough inventions through narrow recombinations of application areas with more sporadic, distant search.

The literature interprets non-patent literature as indicative of science, given that many various types of scientific publications such as articles and conference papers are included, usually up to 40–50% (Callaert et al. 2006; Sampat et al. 2003). As to our interest in timing, Sampat et al. (2003) find a strong time-lag, suggesting delayed recognition should be taken into consideration when evaluating impact.

Our proposal is that the non-patent literature in backward citations does not represent science in general, but a particular type, namely indirect and more diverse linkages. Hence, we need to further explore how to conceptualize what types of science are represented by the non-patent literature. Callaert et al. (2006) have carefully examined the patent and non-patent literature at the EPO and USPTO, and consider the non-patent literature indicative of scientific linkages. Delving deeper into their study, we find that of the non-patent literature, 55–64% are journal articles and the remaining 45–36% are what they call a type of science, such as conference proceedings, industry-related documents, reference books, databases, and technical reports (Callaert et al. 2006: 14). Our interpretation is that these sources indicate a wide range of different types of scientific knowledge and application areas, similar to the external knowledge sources and role of universities as reported in surveys of firms (Klevorick et al. 1995; Laursen and Salter 2004). Thus, considering the different channels through which knowledge arrives, in addition to the wide range of sources and types of knowledge, helps us better understand how and why informal and temporary searches across many fields of knowledge may lead to different types of relevant industrial inventions and innovations, as argued in Gifford et al. (2021). Therefore, we propose that non-patent literature represents a more indirect and more diverse link to the science-base.

H2

Technologies originating from an indirect and more diverse science-base have a high probability of delayed recognition and high impact, i.e., of being sleeping beauties.

3 Data and methodology

We broke down the sleeping beauty metaphor into its two components, delayed recognition (being a sleeper) and high impact (being a beauty), in order to study patents with both characteristics. The research design is consistent with our econometric approach, with results presented in the next section, using three different models for the “sleepers,” the “beauties,” and the “sleeping beauties.”

3.1 Identifying sleeping beauties in nanotechnology

Our first step is to solve the methodological challenge of delineating an emerging field of technology, addressed in the literature by using keywords in bibliometric and patentometric studies (Chen et al. 2008; Huang et al. 2011; Li et al. 2007b; Hullmann and Meyer 2003). A challenge in scientometrics is to delimit the field of nanotechnology in order to meaningfully categorize the researchers as well as their academic and commercial work within nanoscience. There are methodological difficulties since many academics are involved in other disciplines or fields and only occasionally publish and/or patent in nanoscience. We followed the line of research using key words as indicative of community in our study, considering that words and their co-occurrence represent a Kuhnian approach to a deep, narrow community which is reasonably coherent, yet evolving (Kaplan and Vakili 2015).

This paper follows the keyword-based methodology proposed by Porter et al. (2008) for categorizing patents belonging to nanotechnology. We extracted patent data from the Spring 2017 version of PATSTAT. PATSTAT is the official patent database of the European Patent Office (EPO), and provides structured bibliographic and legal data from the EPO’s databases, covering more than 100 million patent records worldwide, from the mid-nineteenth century to the present day.

In this study we use the term “patent” to refer to a patent family with at least one citation. An invention can be applied for in several patent offices, forming what is called a patent family. Thus, we integrated citations to different patents in the same cited nanotechnological family from different patents in the same citing family from any field (nanotechnology or not) as a single citation from the citing family to the cited family. Since most patents never receive any citations, we limited our population to patent families with at least one citation. This leaves 65,759 nanotechnology patent families, with priority years ranging from 1956 to 2012.
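The family-level aggregation described above can be illustrated with a minimal sketch. It assumes a hypothetical mapping from each patent to its family identifier and a list of (citing patent, cited patent) pairs; these names and the toy data are illustrative and are not taken from PATSTAT. Dropping citations within the same family is an additional assumption made only for this example.

```python
from collections import Counter


def family_level_citations(patent_citations, family_of):
    """Collapse patent-to-patent citations into unique family-to-family pairs.

    patent_citations: iterable of (citing_patent, cited_patent) identifiers.
    family_of: dict mapping each patent identifier to its family identifier.
    """
    pairs = set()
    for citing_patent, cited_patent in patent_citations:
        citing_family = family_of[citing_patent]
        cited_family = family_of[cited_patent]
        # Assumption for this sketch: ignore citations within one family.
        if citing_family != cited_family:
            pairs.add((citing_family, cited_family))
    return pairs


def forward_citation_counts(family_pairs):
    """Count how many distinct citing families each cited family has."""
    return Counter(cited for _citing, cited in family_pairs)


# Hypothetical toy data: two offices per family, duplicated citations.
toy_family_of = {"EP1": "F1", "US1": "F1", "EP2": "F2", "US2": "F2", "JP3": "F3"}
toy_citations = [("EP2", "EP1"), ("US2", "US1"), ("JP3", "EP1"), ("US1", "EP1")]

toy_pairs = family_level_citations(toy_citations, toy_family_of)
toy_counts = forward_citation_counts(toy_pairs)
```

In the toy data, the two citations from family F2 to family F1 count once, so F1 ends up with two forward citations at the family level.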

The second step is to identify nanotechnology patents which are also sleeping beauties. Several methods have been proposed in scientific publications, and here adapted to patents. “Beauty,” in this context, is seen as an impact on later science, and measured by the number of citations, which depends on a paper’s field (Van Raan 2015; Dey et al. 2017). “Sleep” can be more or less deep, depending on low frequency of annual citations and how many citations are needed before they start to make a larger impact. For example, a paper might receive very few citations (none or one) per year for many years, and then suddenly start receiving many citations every year. Of course, the more restrictive these definitions, the fewer sleeping beauties we find (Van Raan 2004). In patents, we acknowledge that backward citations in patents follow different patterns than citations in scientific publications. References to prior art limit the scope of a patent’s protection, and thus the rationales for citing are opposite. The citation lifetime of patents, the peak of citations, and the density distribution of citations over years differ from that found in papers (Narin 1994). Moreover, they vary across technical fields. That is why we used forward citations to measure impact.

We identified impact, or “beauty,” with the number of forward citations a patent receives, as a proxy for the technological value of a patent. The citation distribution curve depends on patents’ technical field (Mariani 2004): the average number of citations, as well as their time period, vary from one field to the other. In this study we limited the patent selection to a single technology, nanotechnology, and thus to a single field. Forward citations to nanotechnology patents, though, can come from patents in any field. Citations in patents have been used to recognize breakthrough inventions as those with a particularly high impact, usually in the top quantiles of the distribution of forward citations, even though there are more advanced methods (see Castaldi et al. 2015). For this study, we defined beauties as those patents in the top 10% of the citation curve (Tur 2016). For our robustness checks, we calculated the patents in the top 1% and in the top 5%, where we identified a small number of beauties. The small number of beauties in nanotechnology produced even smaller numbers of sleeping beauties at these thresholds (e.g., only three at 5% of the citation curve, with 5 years sleeping time), which made the numbers negligible for further analysis. Therefore, we chose 10% as a threshold for beauties. Furthermore, we did robustness checks with the threshold at 15%. Given this population and our analysis, “beauties” are defined as patents with 22 citations or more.
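As an illustration of how a top-10% cutoff can be turned into a minimum citation count, the sketch below uses one possible convention: the smallest cutoff such that at most the chosen share of patents reaches it. This convention, the function name, and the toy data are assumptions for illustration; the text does not specify how ties at the threshold were handled.

```python
def top_share_threshold(values, share=0.10):
    """Return the smallest cutoff c such that at most `share` of values are >= c.

    Returns None if no cutoff satisfies the condition (for example, when
    all values are identical).
    """
    n = len(values)
    for cutoff in sorted(set(values)):
        at_or_above = sum(1 for v in values if v >= cutoff)
        if at_or_above <= share * n:
            return cutoff
    return None


# Hypothetical forward-citation counts for 25 patent families.
toy_forward_citations = [1] * 19 + [2, 3, 5, 8] + [22, 40]
toy_cutoff = top_share_threshold(toy_forward_citations, share=0.10)
toy_beauties = [c for c in toy_forward_citations if c >= toy_cutoff]
```

With these toy counts, only two of the 25 families reach the cutoff of 22 citations. The same function could be applied to sleep lengths, or rerun with a different share for robustness checks.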

We also identified delayed recognition, or “sleep” of a patent. Patents have many different associated dates, such as the application date, the publication date, or grant date (if the patent has been granted). A patent’s priority date is the date of the first filing of an application in a patent family. We chose “priority date” because it is closest to the invention date, and less subject to bureaucratic delays such as examination, or formal delays (though there can be strategic in-company delays, as described in Kang and Bekkers 2015). We define a patent’s sleep as the period between the priority date of the cited patent and of the first citing patent. Mirroring our definition of beauty, we define sleepers as those in the top 10% of the sleep length distribution (Tur 2016). Given the population of 65,759 nanotechnology patent families, sleepers are those that do not receive any citations for 4 years or more. (We also ran robustness checks with 3 and 5 years of sleep length in the regressions.)
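The sleep definition above can be sketched as follows. The example assumes that priority dates are reduced to calendar years and that sleep is the simple difference in years; the function names and the default of 4 years (the nanotechnology threshold reported in the text) are used here purely for illustration.

```python
def sleep_length(cited_priority_year, citing_priority_years):
    """Years between the cited family's priority year and its first citation.

    citing_priority_years must be non-empty: the population only contains
    families with at least one citation.
    """
    first_citation_year = min(citing_priority_years)
    return first_citation_year - cited_priority_year


def is_sleeper(cited_priority_year, citing_priority_years, min_sleep_years=4):
    """Flag a family as a sleeper if it received no citation for min_sleep_years."""
    return sleep_length(cited_priority_year, citing_priority_years) >= min_sleep_years


# Hypothetical family with priority year 2000, first cited by a 2004 family.
toy_sleep = sleep_length(2000, [2006, 2004, 2010])
```

Changing `min_sleep_years` to 3 or 5 corresponds to the alternative sleep lengths mentioned as robustness checks.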

With these thresholds for impact and delayed recognition of an emerging technology, a sleeping beauty is defined as a patent with at least 22 citations over its lifetime (a beauty) that did not receive citations for at least 4 years after its priority date (a sleeper). The number of sleeping beauties in the database is 162, or 0.25% of the population.
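A minimal sketch of how the two thresholds combine into the three categories, together with a check of the reported share. The constants come from the text (22 citations, 4 years, 162 sleeping beauties among 65,759 families); the function name and dictionary keys are hypothetical.

```python
BEAUTY_MIN_CITATIONS = 22  # top 10% of forward citations in nanotechnology
SLEEPER_MIN_YEARS = 4      # top 10% of sleep length in nanotechnology


def classify(forward_citations, sleep_years):
    """Return the beauty, sleeper, and sleeping-beauty flags for one family."""
    beauty = forward_citations >= BEAUTY_MIN_CITATIONS
    sleeper = sleep_years >= SLEEPER_MIN_YEARS
    return {"beauty": beauty, "sleeper": sleeper, "SB": beauty and sleeper}


# Share of sleeping beauties reported in the text: 162 out of 65,759 families.
sb_share_percent = round(100 * 162 / 65_759, 2)
```

A family with 30 citations and 5 years of sleep is flagged on all three dimensions, whereas one with 30 citations and 1 year of sleep is a beauty but not a sleeping beauty.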

3.2 Variables

Once we had determined our empirical strategy for identifying sleeping beauties, we then created a dummy variable, SB, which is 1 if a patent is a sleeping beauty and 0 otherwise. In addition, we created two intermediate dummy variables, beauty and sleeper, which identify patents that are highly cited and have experienced delayed recognition.

The main independent variable for Hypothesis 1 should reflect direct and strong science-base linkages to the technology, and our proxy is university ownership of the patents. Therefore, we created the dummy variable university, which is 1 if at least one of the applicants works at a university and 0 otherwise. For Hypothesis 2, the main independent variable should reflect indirect and more diverse science-base linkages, and our proxy is the number of backward citations to the non-patent literature, a new interpretation as discussed in Sect. 2.
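The construction of the two main explanatory variables can be sketched as below. The record layout (an `applicants` list with an `is_university` flag, and a `backward_refs` list with a `type` of "NPL" or "PAT") is a hypothetical structure chosen for this example and does not reflect the actual PATSTAT schema.

```python
def build_regressors(family):
    """Derive the explanatory variables for one patent family record."""
    # H1 proxy: 1 if at least one applicant is a university, else 0.
    university = int(any(a["is_university"] for a in family["applicants"]))
    # H2 proxy: number of backward citations to the non-patent literature.
    npl_refs = sum(1 for r in family["backward_refs"] if r["type"] == "NPL")
    # Backward citations to patents, counted separately from NPL references.
    patent_refs = sum(1 for r in family["backward_refs"] if r["type"] == "PAT")
    return {"university": university, "npl_refs": npl_refs, "patent_refs": patent_refs}


# Hypothetical family co-owned by a firm and a university.
toy_family = {
    "applicants": [{"is_university": False}, {"is_university": True}],
    "backward_refs": [{"type": "NPL"}, {"type": "PAT"}, {"type": "NPL"}],
}
toy_regressors = build_regressors(toy_family)
```

Keeping the two reference counts separate mirrors the distinction drawn in the text between references to patents and references to the non-patent literature.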

We also included variables in our empirical model as controls, selecting the most common ones found to affect patent value, measured in citations, in order to avoid biases due to omitted variables. The size of the patent family is the number of applications related to the same invention, and indicates an invention’s market scope (Lanjouw et al. 1998; Harhoff et al. 2003). As each additional application adds to the cost of protection, applicants will only make these costs for the inventions they deem valuable. The number of claims indicates a possibly broader invention, and has been used to measure patent value (Moore 2005; Gambardella et al. 2008). At several patent offices, fees increase with the number of claims, so only patents that the applicant considers valuable will have a high number of claims. Moreover, the number of IPC subclasses indicates a patent’s technological breadth, which has been related to patent value (Merges and Nelson 1990; van Zeebroeck et al. 2009; Petruzzelli et al. 2015): the more subclasses, the more potential applications to many areas. The number of inventors is related to the size of the research project, so a bigger team of inventors relates to more complicated developments (Mariani 2004; Bass and Kurgan 2010). The number of backward citations to patents shows the embeddedness in the current technological trajectory, and therefore also indicates patent value (Criscuolo and Verspagen 2008; Verhoeven et al. 2016).

Table 1 summarizes the explanatory variables and controls, together with their basic descriptive statistics.

Table 1 Definition of variables and descriptive statistics

4 Results

4.1 Descriptive statistics

Below we present descriptive statistics for our two independent variables. We compared beauties (at least 22 citations) to non-beauties (21 citations or less), sleepers (at least 4 years sleeping) to non-sleepers (less than 4 years sleeping), and sleeping beauties (both a sleeper and a beauty) to non-sleeping beauties (neither a sleeper nor a beauty).
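
The categories above can be illustrated with a minimal sketch. The two thresholds are those stated in the text; the record layout, the field names, and the reading of "years sleeping" as the time until the first citation are assumptions made only for this illustration.

```python
from dataclasses import dataclass

BEAUTY_MIN_CITATIONS = 22  # nanotechnology threshold stated in the text
SLEEPER_MIN_YEARS = 4      # nanotechnology threshold stated in the text


@dataclass
class PatentFamily:
    family_id: str     # hypothetical identifier
    citations: int     # forward citations received by the family
    years_asleep: int  # assumed: years elapsed before the first citation


def classify(p: PatentFamily) -> dict:
    """Flag a patent family as beauty, sleeper, and sleeping beauty."""
    beauty = p.citations >= BEAUTY_MIN_CITATIONS
    sleeper = p.years_asleep >= SLEEPER_MIN_YEARS
    return {
        "beauty": beauty,
        "sleeper": sleeper,
        "sleeping_beauty": beauty and sleeper,  # both conditions must hold
    }
```

A family with 30 citations and 6 years of sleep would be flagged in all three categories, whereas one with 21 citations and 4 years of sleep would be a sleeper only.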

Figure 1 shows the percentage of university-owned patents per category. Universities own 15.9% of all patents categorized as beauties and 24.5% of all patents categorized as non-beauties. Universities also own fewer patents categorized as sleepers and fewer patents categorized as sleeping beauties. Our first reflection is that being a sleeping beauty is not associated with university-ownership. Another interesting observation is that similar percentages of beauties and sleepers are university-owned, but the number drops significantly for sleeping beauties.

Fig. 1 Percentage of university-owned patents in each category

In the next graph we present the descriptive statistics for the second independent variable, non-patent literature references. Figure 2 shows the difference in the non-patent literature references among the three categories of sleeping beauties, beauties, and sleepers. As expected, beauties have the highest average number of non-patent references, while sleepers have the lowest. By contrast, the averages for sleeping beauties and non-sleeping beauties converge.

Fig. 2 Average number of references to non-patent literature in each category

We follow the same definition of sleepers and beauties to identify sleeping beauties in the general population of patents (all fields) in comparable years (1956–2012). There are 16,388,114 patent families between 1956 and 2012 that have received at least one citation. The beauties are those in the top 10% of the citation distribution: those that have at least 14 citations. Likewise, the sleepers are those in the top 10% of the distribution of sleeping time: those that slept for at least 13 years. By comparison, the thresholds in nanotechnology were 22 citations for the beauties and 4 years of sleep for the sleepers. Nanotechnology patents are therefore cited more often, and also earlier, than the general population of patents (with all fields mixed together). Different sectors are expected to have different citation cultures (Hall et al. 2005), and the differences we observe in the distributions of citations and of sleep are consistent with this. This result points to the need to separate these analyses by sector, to make sure that we are only considering patents that are comparable.
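
One way such a top-10% cut-off could be derived from a distribution is sketched below. The nearest-rank rule and the function name are assumptions for illustration; the text does not state how ties at the threshold were handled, and with ties an "at least this value" rule can select somewhat more than 10% of the observations.

```python
def top_share_threshold(values, top_percent=10):
    """Smallest value among the top `top_percent` of observations (nearest-rank rule)."""
    if not values:
        raise ValueError("values must be non-empty")
    ranked = sorted(values, reverse=True)
    # Size of the top group, rounded up with integer arithmetic, at least 1.
    k = max(1, (len(ranked) * top_percent + 99) // 100)
    return ranked[k - 1]
```

Applied to the citation counts of a population, the returned value plays the role of the "at least 14 citations" cut-off; applied to sleeping times, it plays the role of the "at least 13 years" cut-off.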

Table 2 compares the number of beauties, sleepers, and sleeping beauties in nanotechnology with the general population of patents. The comparison shows a strong sleeping beauty effect in nanotechnology. That is to say: in nanotechnology a patent is more likely to be a sleeping beauty than in the general population (0.25% vs 0.03%). Moreover, a sleeper is more likely to be a sleeping beauty than in the general population (2.26% vs 0.33%), and a beauty is also more likely to be a sleeping beauty (2.46% vs 0.36%). These differences are all statistically significant at the 0.001 level. The interpretation of these results is difficult, since the distribution of citations in the general population mixes different sectors with potentially very different citation cultures. However, they suggest that in nanotechnology, impact can come in the longer term, since over one in fifty patents that sleep will still experience high impact later in their life.
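
A difference between two shares of this kind can be checked with a two-proportion z-test, sketched below. The text does not say which test was used, so this is only one plausible choice, and the counts in the usage lines are invented round numbers that merely reproduce shares of 0.25% and 0.03%; they are not the counts from Table 2.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two independent proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common share under the null hypothesis
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("standard error is zero; the test is undefined")
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p_value


# Hypothetical counts: 250 of 100,000 (0.25%) versus 300 of 1,000,000 (0.03%).
z, p_value = two_proportion_z_test(250, 100_000, 300, 1_000_000)
significant_at_0_001 = p_value < 0.001
```

With populations of this size, even modest differences in shares tend to be statistically significant, which is one reason the substantive interpretation above rests on the magnitude of the gap rather than on the test alone.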

Table 2 Prevalence of sleeping beauties in the population

The definition of sleeping beauties is intrinsically related to the age of the patent. Older patents have more opportunities to receive citations, and also longer periods for awakening. Thus, one would expect both the beauties and the sleepers to be older patents. Figure 3 shows that this is not the case: both the population of beauties (blue line) and sleepers (red line) are skewed toward later years, just as the overall nanotechnology population (black dotted line). Sleeping beauties (green line), on the other hand, are more evenly distributed over the whole period, exhibiting a different pattern from the other populations. In our empirical analysis, we control for the priority year and exclude the last 4 years from the analysis (equal to the sleeper threshold) to control for age.

Fig. 3 Number of patents per priority year in each category (color figure online)

4.2 Econometric results

Based on the theoretical literature, we developed hypotheses to help us identify the significant predictors of what we defined as delayed recognition high impact patents, or “sleeping beauties.” Our dependent variable is a binary variable built on two underlying binary variables (beauty and sleeper). We ran a logistic regression with our main dependent variable (sleeping beauty) and, because of the dual composition of this variable, we ran two additional regression models with each of the two underlying variables as the dependent variable. Splitting the analysis into three models helped us scrutinize the varying effects of each explanatory variable on the different components of the phenomenon. Our dependent variable is a constructed variable based on assumptions of a top quartile threshold in terms of citations as well as the patent’s sleeping time.Footnote 3 Last but not least, our control variables are the measurements associated with patent value commonly used in the empirical patent analysis literature. Note that some observations lacked information on some of our controls, so the number of observations is lower than in Table 2.

We present three econometric models in the population of nanotechnology patents with the same explanatory variables and three dichotomous dependent variables: being a beauty, a sleeper, or a sleeping beauty. We expect patent value to be related to a higher number of citations and also to a shorter sleep. Thus, these variables are expected to have a mixed effect on being a sleeping beauty, pulling in opposite directions. We ran the econometric model in Eq. 1, where \({X}_{i}\) is the binary variable of being a beauty, a sleeper, or a sleeping beauty, and all other variables are defined in Table 1.Footnote 4

$$ \begin{aligned} \log \left( \frac{P(X_{i})}{1 - P(X_{i})} \right) = {} & \alpha + \beta_{1} \cdot \text{university owned} + \beta_{2} \cdot \text{non patent literature} + \beta_{3} \cdot \text{year} \\ & + \beta_{4} \cdot \text{granted} + \beta_{5} \cdot \text{inventors} + \beta_{6} \cdot \text{claims} + \beta_{7} \cdot \text{references} \\ & + \beta_{8} \cdot \text{company} + \beta_{9} \cdot \text{family size} + \beta_{10} \cdot \text{number IPC} + \sum_{\text{XnnX}} \beta_{\text{XnnX}} \, \text{XnnX} \end{aligned} $$
(1)
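
As a reading aid for Eq. 1, the standard logistic identities can be written out. Here \(\eta_{i}\) is shorthand, introduced only for this note, for the whole right-hand side of Eq. 1 evaluated for observation \(i\), and the logarithm is assumed to be the natural logarithm, as is usual for logit models:

$$ \frac{P(X_{i})}{1 - P(X_{i})} = e^{\eta_{i}}, \qquad P(X_{i}) = \frac{1}{1 + e^{-\eta_{i}}} $$

Under this reading, and holding the other variables fixed, switching the university dummy from 0 to 1 multiplies the odds of the outcome by \(e^{\beta_{1}}\), and one additional non-patent literature reference multiplies them by \(e^{\beta_{2}}\). A positive coefficient therefore raises the odds of being a beauty, a sleeper, or a sleeping beauty, and a negative coefficient lowers them.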

The models include dummy variables for all IPC subclasses (4-digit codes) with more than 3000 patent families. The results are presented in Table 3. Note that the number of observations is lower than for the general nanotechnology population, due to missing cases when matching our data with the dependent and control variables.

Table 3 Regression analyses

Patents owned by universities are significantly more likely to be beauties, and significantly less likely to be sleepers.Footnote 5 However, we observed no effect of university ownership on a patent being a sleeping beauty. These results lead us to reject Hypothesis 1 and furthermore indicate that university-owned patents have a higher probability of waking up early.

Our analysis confirms that the more backward citations to non-patent literature a patent has, the more citations it receives. This is shown by the positive, significant effect of non-patent literature on the probability of being a beauty. Nonetheless, we also show that this variable has a negative effect on the probability of being a sleeper. Ultimately, Hypothesis 2 is not confirmed: patents with more non-patent references are not more likely to be sleeping beauties.

The control variables show some expected results. The size of the family and the number of claims have a positive effect on the probability of a patent being a sleeping beauty. The effects of the number of inventors and of the patent being granted are stronger for sleeping beauties than for the other two populations. The number of inventors of a patent has a negative effect on the probability of being a sleeping beauty, suggesting a social cause for sleeping beauties: more inventors can diffuse an invention through a wider network of contacts.Footnote 6 Having a patent in the family granted, on the other hand, has a positive effect on the probability of being a sleeping beauty. Overall, the characteristics that make a patent more likely to be highly recognized also make it more likely to be recognized earlier. The only exceptions to this pattern are the number of IPC subclasses and company ownership.

Finally, once we controlled for all the other variables, the probability of a patent being a beauty is negatively related to the priority year, since older patents have more time to accumulate citations. Similarly, older patents have more time to receive recognition, so the probability of being a sleeper is also negatively related to the priority year. Note here that only patents that are awake appear in our population, excluding from the analysis old patents with no citations.

5 Concluding remarks and discussion

Our study examined whether innovation depends on long-term patterns of interactions in technology and science, using patents in nanotechnology. The previous literature has distinguished between the degree of novelty (science-base) and the degree of technological impact (innovation) of a technology, and stressed that links to the science-base will lead to more breakthrough inventions, due to distant recombinations (Trajtenberg 1990; Kaplan and Vakili 2015; Wang et al. 2017; Fleming and Sorensen 2001, 2004). Our paper then addresses the context of breakthrough industrial inventions in an emerging technology, focusing on the case of nanotechnology. To discover long-term patterns, we developed an empirical strategy to study nanotechnology patents through the metaphor of sleeping beauties, which highlights the combination of delayed recognition and high impact of an emerging technology. Nanotechnology was chosen as an emerging technology, known to be a science-based technology with extensive linkages (Meyer 2000, 2006; Dang et al. 2010; Shapira and Wang 2009; Thursby and Thursby 2011; Guan and Ma 2007; Alencar et al. 2007). Our initial empirical comparison suggests that sleeping beauties occur more frequently in nanotechnology than in the general population of patents.

One of our contributions is that the linkages between science and technology can be studied at a more fine-grained level, namely distinguishing two types of science linkages, “direct and strong science-base” and “indirect and more diverse science-base” as impacting recombinations of technology. Methodologically, we proxy these two types of science linkages as, respectively, university ownership of patents and backward citations to the non-patent literature.

In general, we found that science matters for explaining high impact patents within nanotechnology, in the sense that both types of science linkages lead to more forward citations. This part of our results was expected given a fairly robust finding in the literature across technology fields and IPC classes that science affects technological value. This has been proxied in different ways (including the two we use), with robust results for university-owned patents (Henderson et al. 1998; Sampat et al. 2003; Bacchiocchi and Montobbio 2009; Crespi et al. 2010; Ljungberg and McKelvey 2012; Czarnitzki et al. 2012) and backward citations to non-patent literature (Carpenter et al. 1981; Reitzig 2003; Nagaoka 2007). Hence, in this science-based technology, having links to science does matter for explaining the impact of that technology. Our interpretation is that these close or dense relationships to science likely signal a breakthrough invention, which later impacts a technological trajectory. This may also be because the same individual academics are highly influential in articles and patents (Bourelos et al. 2017).

Interestingly, with respect to our contribution on the different types of search, both our hypotheses were rejected, which suggests that our main results on delayed recognition and high impact technologies were not as expected, relative to debates in the current literature. We find that both proxies, university-owned patents as well as the non-patent literature references, have a significant but negative effect on being a “sleeper” and are not significant for explaining “sleeping beauties.” For the first hypothesis, on patents with direct and strong science-base linkages to the technology, we expected, based on reasoning by Fleming and Sorensen (2001, 2004), that such patents involve recombinations of highly coupled, interdependent components; these inventions may therefore take longer to be recognized, because they are further away from the current localized search. For the second hypothesis, on patents with indirect and diverse science-base linkages, we expected, based on Kaplan and Vakili (2015), that recombining technologies that depend on highly coupled, interdependent components may require an indirect link to science, in order to try out a wider number of possible combinations with a local search around application areas. Thus, these inventions may take longer, because more combinations need to be tested.

We go beyond the existing literature, in order to stress the importance of having indirect and diverse science-base linkages to the technology for signaling technological impact, which in turn may lead to potential future breakthrough inventions in nanotechnology. Our proxy is backward citations to the non-patent literature, as we have argued these are informal and temporary ways of searching across a broader range of scientific fields and industrial application areas. Our reasoning is that making recombinations of technology that depend on highly coupled, interdependent components may require an indirect link to science, in order to try out a wider range of possible combinations with local search around application areas, and hence these inventions may take longer, because more combinations need to be tested. We propose that the non-patent literature should not be considered science linkages in general, which can be considered an extension of arguments also found in Callaert et al. (2006). More specifically, we propose that these references reflect a search among various types of codified and informal technological and scientific knowledge.

Contrary to expectations, in nanotechnology, we find that science-based patents mainly have an earlier impact, and possibly also a more direct impact on industrial invention. Hence, within such science-based technologies, we propose that the long-term patterns of delayed recognition and high impact may instead be explained by developing, in future research, a more nuanced understanding of the role of firms in combining multiple knowledge sources, seeing the firms as knowledge-intensive firms required for innovation. Firms need to combine different types of scientific and technological knowledge. Industrial invention and innovation within firms thereby require the firm to manage diverse cognitive communities, as argued by Nightingale (1998), and to manage an evolutionary problem-solving process with multiple ambiguities and gray zones, as argued by McKelvey and Saemundsson (2021). This interpretation is in line with other findings, in that our control variables of IPC application class and company-owned patents show some degree of significance. We conceptualize that these represent a firm context, and specifically the need to combine many technologies within firms’ industrial inventions. Unlike the representation of narrow application areas such as by Kaplan and Vakili (2015), the application areas for nanotechnology are proposed as quite dense and complex, suggesting the need to recombine multiple application areas to achieve delayed breakthrough inventions, in a corporate setting. Hence, within the science-based technology of nanotechnology, delayed recognition and high impact specifically require multiple technologies, specializations, and industrial applications.

Regarding implications for regions interested in developing nanotechnology (and other emerging technologies), our interpretation suggests the need to have a supportive regional knowledge base, predominantly based on analytical (scientific) or synthetic (industrial) knowledge, as identified by Asheim and Coenen (2005), e.g., a density of connected universities and knowledge-intensive firms. Thus, we suggest that it is important to further understand the knowledge base and impacts of large companies for stimulating potential breakthrough inventions in emerging technologies.

We acknowledge that this study has many limitations due to data and approach. As such, this article makes limited contributions, but it also opens up interesting areas for future research. A first group of limitations has to do with choices related to the dataset. We are limited to EPO data, whereas it is possible to examine backward citations from different patent offices, in order to follow previous literature (Alcacer and Gittelman 2006; Criscuolo and Verspagen 2008) and study whether it matters if citations in patents are assigned by the examiners or by the inventors. This study compares nanotechnology with all patents, but our research design only used EPO data for nanotechnology, taken as indicative of other emerging technologies for our questions. Comparisons could be made with different types of technologies or else with other key enabling technologies identified by the EU or other policy makers, as done by Neuländtner and Scherngell (2020), to examine the generalizability of our findings. Moreover, in our definition of beauties, we aggregated citations from a citing DOCDB family to a cited DOCDB family, which is a much better approach than taking citations from a citing individual patent to a cited individual patent. Nonetheless, it does not take into account that some patent offices add more references to their patents, and that patents from such offices thus receive more forward citations. Thus, the bigger the family, the more likely it is to contain at least one patent from such a patent office, and thus to receive more citations. We have corrected for this by controlling for the size of the family in our analysis, but in future research a more advanced approach could define highly cited patents by patent office, before aggregating to the patent family. A second group of limitations concerns the explanation of our findings, which future research could address.

We acknowledge that the paper looks at the overall phenomenon of delayed recognition high impact technology, which could be further understood through more fine-grained analysis. Hou and Yang (2019) have identified different patterns of delayed recognition in patents. Future research could focus on the characteristics and position of the first citing patent. We can think of this as the one which “awakens” the sleeping beauty, a so-called prince in this metaphor. The first citing patent (or the first large cluster of citing patents) should be analyzed, because it signals the point in the technological development where our studied technology starts to greatly impact the overall technological trajectory. Our study does not explain why delayed recognition and high impact may occur together. Future research could delve into explicit explanations. Reasonable alternative explanations can be formulated as follows. A new advance at the frontier of technological progress may be ahead of its time and remain latent until complementary knowledge that builds on it has been developed. An alternative hypothesis would be that the social network of inventors is decisive for the diffusion of inventions, and that isolated actors with a weak social position lack the means to make their inventions noticed. This second explanation would reveal a shortcoming of the technology system, since important developments are ignored, delaying further technological development. In such a case, there may be scope for policy action to correct this flaw in the diffusion of technologies. A complementary study along these lines could investigate in more detail how and why knowledge, geography, and networks are interrelated in explaining knowledge creation and diffusion in this case (in line with Henning and McKelvey 2020; Bathelt and Glückler 2003; Bathelt et al. 2004).