The “Mendel syndrome” in science: durability of scientific literature and its effects on bibliometric analysis of individual scientists
The obsolescence and “durability” of scientific literature have been debated for many years, especially regarding the proper calculation of bibliometric indicators. The effects of “delayed recognition” on impact indicators are of interest not only to bibliometricians but also to research managers and scientists themselves. It has been suggested that the “Mendel syndrome” is a potential drawback when assessing individual researchers through impact measures. If publications from particular researchers need more time than “normal” to be properly acknowledged by their colleagues, the impact of these researchers may be underestimated with common citation windows. In this paper, we address the question of whether the bibliometric indicators for scientists can be significantly affected by the Mendel syndrome. Applying a methodology developed previously for the classification of papers according to their durability (Costas et al., J Am Soc Inf Sci Technol 61(8):1564–1581, 2010a; J Am Soc Inf Sci Technol 61(2):329–339, 2010b), the scientific production of 1,064 researchers working at the Spanish Council for Scientific Research (CSIC) in three different research areas has been analyzed. Cases of potential “Mendel syndrome” are rarely found among researchers and these cases do not significantly outperform the impact of researchers with a standard pattern of reception in their citations. The analysis of durability could be included as a parameter for the consideration of the citation windows used in the bibliometric analysis of individuals.
Keywords: Durability, Obsolescence, Bibliometric indicators, Individual level analysis, Micro-level analysis, Mendel syndrome
Among bibliometric analyses, micro-level approaches are of high policy interest. Not only are they useful for supporting research assessment processes (Sandström and Sandström 2009) but they also serve as tools for the study of the behavior of researchers, their publication strategies, their organization in teams (Bordons et al. 1995a, b), etc. In this sense, the study of scientific production at the individual level offers highly valuable information for a much better understanding of scientific processes in general. However, the micro-level (and especially the individual level) is one of the most difficult and problematic levels of analysis in bibliometric studies (Costas and Bordons 2005). The most important problems are related to the lower validity of statistical indicators applied to small units, the requirement of higher levels of precision and recall in the collection of data for individuals, and the risk of side-effects and ethical problems that the inadequate and unbalanced use of bibliometric indicators at this level can raise (Butler 2008). All these elements and ethical concerns need to be taken into account when using bibliometrics for assessment purposes (Weingart 2005), especially at this level, where bibliometrics finds its best application in combination with peer review in what has been labeled “informed peer review” (Nederhof and van Raan 1987; Aksnes and Taxt 2004).
Another central element in bibliometric research is the analysis of the obsolescence (Line 1993) or “durability” of knowledge (Tahai and Rigsby 1998). The analysis of the ageing of scientific production has been frequently addressed in bibliometric literature (Aversa 1985; Glänzel and Schoepflin 1995; Moed et al. 1998; Aksnes 2003), paying special attention to related concepts such as “scientific prematurity” (Stent 1972; Glass 1974), “delayed recognition” (Garfield 1980, 1989; Glänzel et al. 2003) or “Sleeping Beauties” (van Raan 2004). Several indicators based on the age distribution of references and citations have also been suggested and used in the scientific literature for many years. Some of the most important are the “Price Index”, the “Immediacy Index” and the “Cited Half-Life” (Burton and Kebler 1960; de Solla Price 1965; Moed 1989), which are still in use (Magri and Solari 1996; Meadows 2004; Amat and Yegros Yegros 2009).
Different reasons for delays in the reception of new scientific ideas have been suggested (Campanario and Acedo 2007), including resistance to conceptual change (Campanario 2002), errors of judgment, conceptual conservatism, rivalries, lag time for the discipline to mature conceptually, lack of technological or analytical tools (Graham and Dayton 2002), etc.
The effects of durability of knowledge are important not only for the scientific communication itself (Pollmann 2000) but also for the proper calculation of bibliometric indicators (especially the determination of citation windows). The idea of “delayed recognition” has been frequently used to discredit bibliometric indicators for their use in science policy (Garfield 1970; Glänzel 2008) claiming that some ideas, publications and researchers are so “ahead of their time” that it is not possible to properly analyze their research performance through bibliometric indicators based on time windows relevant in a research policy context.
In Van Raan’s paper (2004) as well as in the work of other authors (Glänzel et al. 2003) empirical evidence was given that “Sleeping Beauties” are extreme and rare cases of delayed recognition. Although they indeed happen, their low frequency makes them a peripheral and quite a rare problem in the application of bibliometric indicators and research assessment. Recently, a new and field-normalized methodology for the study of the durability of scientific literature has been developed (Costas et al. 2010b). This methodology, based on a more general approach for the consideration of the durability of scientific literature (not based on fixed citation-windows but on the distribution of citations over time among the different fields), allows the classification of all publications covered by citation databases (i.e., Web of Science or Scopus) in three types of durability: “Normal”, “Flash in the pan” and “Delayed” papers. Thus, the durability of the publications of different units of analysis (e.g. research teams, individuals, etc.) can be studied from a more general perspective. This enables us to respond (for the first time) to the request made by Garfield (1990) for a handy yardstick to detect and measure delayed recognition. In other words, with this methodology it is possible to study the durability of the production of different scientific units (individual researchers, research teams, universities, etc.) and to detect the possible effects that durability can have on the measurement of the performance of these units of analysis.
In 1979, Garfield suggested the existence of a kind of “Mendel syndrome” in science related with the “inability of citation counts to identify premature discoveries—work that is highly significant but so far ahead of the field that it goes unnoticed”. This idea has later been explored by other authors (Beed and Beed 1996; Cohn et al. 1998; Rodriguez-Ruiz 2009). Foss (1995) defined the “Mendel syndrome” as “strikingly original contributions that are neglected in their own time, only to be hailed by later generations”. In a similar line, Van Raan (2004) defined the “Mendel syndrome” as scientists claiming “that one or more of their publications will not be picked up for a while, as they are ‘ahead of time’ ”.
According to these definitions, we suggest that an extension of the “Mendel syndrome” or “Mendelism”1 is the situation in which scientists (or any unit of analysis) develop lines of research and have a profile of publications (‘oeuvres’) “ahead of their time”. This implies that their production is not properly acknowledged by contemporary colleagues and that they are not recognized until a period of time has passed, which generally implies that they will “suffer” from a delay in receiving citations. In other words, the “Mendel syndrome” can be defined as the undervaluation through citation analysis of units (individuals, teams, etc.) due to significant patterns of delayed reception of citations in their scientific publications. Thus, the analysis of the “Mendel syndrome” focuses more on detecting “delayed” patterns in the whole oeuvre of researchers than on the detection of occasional “delayed” publications or “Sleeping Beauties”.
To the best of our knowledge, there are no approaches that study the scientific performance of individual researchers considering the ageing or durability of their publications (i.e. from a micro-level perspective). This raises questions such as “can bibliometric performance assessment of scientists be significantly affected by publishing papers that need more time than standard to be acknowledged by their peers (i.e. ‘delayed’ papers)?” or, for the opposite situation, “are there scientists who mainly publish papers that are rapidly cited after publication but forgotten after some time (i.e., ‘flash in the pan’ papers)?” These questions have hardly been studied before, and their answers will provide important knowledge about the scientific communication process itself. The study of these questions will not only help to appraise the validity of bibliometric indicators for research assessment (in this case, at the individual level) but will also provide more insight into the determinants and characteristics of researchers producing literature with different patterns of durability.
The general objective of this analysis is to study the effect of differences in the durability of scientific knowledge based on bibliometric indicators for the analysis of individual researchers.
The main question is whether assessments of scientists can be significantly affected by the presence of papers with different types of durability within their publication profiles. Our target is to analyze to what extent individual scientists produce papers with different types of durability, and how these different types of durability affect the analysis of the performance of their oeuvres from a bibliometric point of view. In our approach, we intend to find out whether the “Mendel syndrome” has a basis in reality or should be relegated to the realm of myth (Glänzel and Garfield 2004; Glänzel 2008).
To study the presence of papers with different types of durability within the output of individual researchers, paying special attention to the distribution of researchers into scientific performance classes, the age of researchers and their experience working at the same institution.
To study in what way scientists are affected by different types of durability, analyzing whether that changes significantly their position in rankings based on bibliometric indicators.
To detect possible researchers with deviant patterns of durability in their production and to study their main characteristics.
Data and methodology
For the analysis of durability at the individual level, 1,064 researchers working at the Spanish Council for Scientific Research (CSIC) were considered. These scientists belong to three different organizational research areas: Natural Resources (349), Biology and Biomedicine (388) and Materials Science (327).2
The scientific production of these researchers has been downloaded from the Web of Science database for the period 1994–2004. A total of 24,982 publications were collected and assigned to these individual scientists.
Individual level data
“Top performance class”, scientists with scores higher than the percentile 75 in these dimensions;
“Medium performance class”, scientists with values of performance between percentiles 75 and 25 in the three dimensions;
“Low performance class” with the lowest scores (lower than percentile 25) among the three dimensions.
Thus this methodology offers a quite balanced classification of researchers based on their performance on the basis of different bibliometric dimensions. The most remarkable innovative feature of this approach is that top performance is not linked to a single indicator alone but to performance indicators pertaining to several scientific dimensions (Production, Impact and Journal Quality) and also depending on the distribution of all researchers across these dimensions (calculations based on percentiles). Thus, top researchers are those with a high performance not only regarding production but also with regard to citation impact and the quality of the journals in which they publish (for a detailed description of these three classes see Costas et al. 2010a).
Durability classification of publications
Normal papers,3 papers with a standard pattern of ageing in comparison to the papers in the same field.
Delayed papers, papers that receive their citations later than the average publications in their research fields. From a “technical” perspective we define them as those publications that receive 50% of their citations when 75% of the publications in the same research field have already received 50% of their citations.
Flashes in the pan, papers receiving citations immediately after their publication but losing their impact sooner than average papers. This type of durability has been suggested by Zuckerman and Miller (1980) and Van Dalen and Henkens (2005) as being the opposite of papers with delayed recognition. The “technical” definition of flash in the pan is that these papers had already received 50% of their citations when 75% of publications in their research fields had not yet received 50% of their citations.
It is important to mention that the classification of publications in different durability types is based on their citation histories as compared to the citation histories of all other papers published worldwide in their research fields in terms of the Web of Science subject categories. For a complete explanation of the process of document classification by durability types we refer to Costas et al. (2010b) and also to van Raan (2004) for a description of the “Sleeping Beauties” analysis. It should be mentioned that the methodology in Costas et al. (2010b) is different from that used by van Raan (2004) for the analysis of the “Sleeping Beauties”. The “Sleeping Beauties” methodology focuses on the detection of unique and single highly cited “delayed” publications, based on fixed years and levels of citations; while the “durability” methodology (the one used here) is more ‘flexible’ and focuses on the analysis of the different patterns of citation histories of publications across the different fields. Accordingly, the idea of “delayed” here does not necessarily mean “highly cited”. In other words, we assume that both highly cited and infrequently cited publications can have the same citation pattern (i.e. they age at the same rate), something that was also observed by Noma and Olivastro (1985) in an analysis of patents. As a simplification we can say that all “Sleeping Beauties” are also delayed publications,4 but not all delayed publications are necessarily “Sleeping Beauties”.
For the classification of publications into types of durability, we consider in this study citations from the year of publication of the paper until 2008 and the WoS classification in subject categories.
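The “technical” definitions above can be illustrated with a minimal sketch. It is not the implementation of Costas et al. (2010b). It assumes that each paper is represented by its yearly citation counts since publication, that the moment a paper reaches 50% of its citations is measured in whole years, and that the field reference values are the 25th and 75th percentiles of that moment among the cited papers of the field. The function names and the handling of uncited papers are hypothetical.

```python
from typing import List, Optional


def half_citation_year(yearly_citations: List[int]) -> Optional[int]:
    """Years after publication (0 = publication year) at which a paper has
    accumulated at least 50% of its total citations; None if uncited."""
    total = sum(yearly_citations)
    if total == 0:
        return None
    running = 0
    for year, count in enumerate(yearly_citations):
        running += count
        if 2 * running >= total:
            return year
    return None  # not reached when total > 0


def percentile(sorted_values: List[float], q: float) -> float:
    """Percentile (q in [0, 100]) of a sorted list, by linear interpolation."""
    if not sorted_values:
        raise ValueError("no values to compute a percentile from")
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def classify_durability(paper: List[int], field_papers: List[List[int]]) -> str:
    """Classify one paper against the citation histories of its field.

    Assumption: 'Delayed' means the paper reaches half of its citations later
    than 75% of the field's papers do; 'Flash in the pan' means earlier than
    75% of them (i.e. below the field's 25th percentile)."""
    field_years = sorted(
        y for y in (half_citation_year(p) for p in field_papers) if y is not None
    )
    p25 = percentile(field_years, 25)
    p75 = percentile(field_years, 75)
    year = half_citation_year(paper)
    if year is None:
        return "Uncited"  # hypothetical label; uncited papers have no pattern
    if year > p75:
        return "Delayed"
    if year < p25:
        return "Flash in the pan"
    return "Normal"
```

In practice the field reference set would be all papers of the same Web of Science subject category, with citations counted from the publication year until 2008 as described above.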
Total number of publications (P) indexed by the Web of Science database during the period 1994–2004 (all document types included). These are the “source” publications assigned to the 1,064 individual scientists under study.
Total number of external citations (C). These are the citations received by the above mentioned papers, excluding self-citations. Two citation periods for the counting of citations have been considered:
1994–2004, this is the same time window as for the source publications. Thus, publications published in 1994 have 11 years of citation window, publications in 1995 have 10 years, and so on.
1994–2008, that means that papers published in 1994 have 15 years of citation window, publications published in 1995 have 14 years of citations and so on. Thus the citation window for each paper is extended by 4 years.
Citations per Publication, excluding self-citations (CPP). Also for this indicator both citation periods (1994–2004 and 1994–2008) were considered.
Indicator CPP/FCSm is the comparison of the CPP with the field-based worldwide average impact (FCSm) (cf. Moed et al. 1995 for a broader explanation of these indicators). Values of this indicator above 1 indicate that the citation impact of a researcher is above the international reference value, while scores below 1 indicate the opposite. Also for the calculation of CPP and CPP/FCSm both citation periods were considered.
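The indicators listed above can be sketched as follows. This is a simplified illustration under stated assumptions, not the production code behind the study: the `Paper` record and its `field_mean_citations` attribute (the worldwide average citation rate for comparable papers in the same field) are hypothetical names, and CPP/FCSm is taken as the ratio of the researcher's mean citations to the mean of the field reference values of his or her papers.

```python
from typing import List, NamedTuple


class Paper(NamedTuple):
    pub_year: int
    external_citations: int      # citations received, self-citations excluded
    field_mean_citations: float  # hypothetical field reference value


def citation_window_years(pub_year: int, last_citing_year: int) -> int:
    """Length of the citation window, counting the publication year itself."""
    return last_citing_year - pub_year + 1


def cpp(papers: List[Paper]) -> float:
    """Citations per publication (CPP), excluding self-citations."""
    if not papers:
        raise ValueError("CPP is undefined for an empty publication set")
    return sum(p.external_citations for p in papers) / len(papers)


def cpp_fcsm(papers: List[Paper]) -> float:
    """CPP divided by the mean field citation score (FCSm)."""
    fcsm = sum(p.field_mean_citations for p in papers) / len(papers)
    if fcsm == 0:
        raise ValueError("FCSm is zero; the ratio is undefined")
    return cpp(papers) / fcsm
```

With these definitions, `citation_window_years(1994, 2004)` gives the 11-year window and `citation_window_years(1994, 2008)` the 15-year window mentioned above. Both indicators would be computed twice, once per citation period, with the citation counts and field reference values taken for the corresponding window.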
The three areas show different patterns of ageing. Biology and Biomedicine and Materials Science present an initial trend of steep increase in the number of citations during the first 3 to 5 “years after publication”.6 After that they decrease at a quite moderate rate. Papers assigned to the area of Natural Resources reach their peak in citations later than papers assigned to the other two areas (5–6 years following publication), but after that the decrement in the number of citations is still very slow.
Natural Resources has proportionally more delayed papers than the other two areas. Because scientific literature in this area has a longer life-span, more papers are classified as delayed publications. On the other hand, Materials Science contains proportionally more flashes in the pan than the other two areas. Biology and Biomedicine is the area with the highest percentage of normal publications. A χ2 test shows p < 0.001, suggesting that the hypothesis of independence among variables (i.e. area and durability types of publications) must be rejected and implying that the distribution of durability types of publications differs significantly among research areas.
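The test of independence between area and durability type can be sketched as below. The counts in `hypothetical_counts` are invented for illustration only and are not the data of this study. The statistic is compared with the tabulated critical value of the χ2 distribution for 4 degrees of freedom at the 0.001 level (18.467), which is an assumed significance level for the example.

```python
from typing import List, Tuple

# Tabulated chi-square critical value for 4 degrees of freedom, alpha = 0.001.
CHI2_CRITICAL_DF4_ALPHA_001 = 18.467


def chi_square_independence(table: List[List[float]]) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for a contingency
    table (rows = research areas, columns = durability types).
    Assumes every row and column total is positive."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts (NOT from the paper): Normal, Delayed, Flash in the pan.
hypothetical_counts = [
    [600, 250, 150],  # area A
    [700, 140, 160],  # area B
    [620, 170, 210],  # area C
]
statistic, dof = chi_square_independence(hypothetical_counts)
reject_independence = dof == 4 and statistic > CHI2_CRITICAL_DF4_ALPHA_001
```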
Figure 3 shows how flashes in the pan present a rapid increase in citations during the first years after publication, sometimes being more cited than delayed and normal papers (in Natural Resources). The impact of normal publications (CPP) increases during the first years after publication, reaching a peak after 3–5 years and then decreasing with time. Finally, delayed papers have a more stepwise growth in citations and they reach their peak much later in time. With respect to this type it is very interesting to observe the case of Natural Resources, where delayed publications do not reach a clear peak during the 15 years considered.
In the following sections the results at the individual level are presented focusing on the distribution of the three durability types of publications for the different individual scientists involved in the analysis.
General distribution of durability types of publications at individual level
As can be expected, the normal type is the most common type within the profile of individual researchers, fluctuating between 60 and 70% among the three areas. Delayed papers and flashes in the pan each represent less than 20% of the total number of publications at the individual level. In Natural Resources and Materials Science the percentage of delayed papers is higher than the percentage of flashes in the pan, while in Biology and Biomedicine researchers have a higher percentage of flashes in the pan than of delayed papers, and also slightly more normal papers than in the other two areas.
Distribution of durability types of publications by performance class
Scientists belonging to the top performance class have in all three areas the highest levels of normal publications and the lowest levels of delayed papers and flashes in the pan compared to the other two classes. In this sense, a positive relation can be observed between the share of normal publications and scientific performance class. On the other hand, the shares of delayed and flash in the pan publications increase as the scientific class of researchers decreases. These patterns show that the production of normal publications is the most common pattern among top researchers, while medium and low class researchers present proportionally more flash in the pan and delayed publications. Statistically significant differences were observed in practically all cases (p < 0.05).
Distribution of durability types of publications by age
In all areas, we observe that young researchers publish more normal publications than senior and veteran researchers. Statistically significant differences were found (p < 0.05) between young and veteran researchers in Biology and Biomedicine and in Materials Science regarding delayed and normal publications.
Impact of publications of individual researchers considering the durability types of publications
As expected, the average impact scores for scientists increase for the three types of durability as the length of the citation measurement window increases (Wilcoxon Signed Ranks Test p < 0.001 for all cases and areas). However, it is important to stress that for the three areas scientists have higher CPP values in their normal papers as compared to the other two types of durability, regardless of the citation period considered (p < 0.001 for all cases). Clearly, the CPP of flash in the pan papers is the same as or higher than that of delayed papers for the shorter period (1994–2004); but when the period is extended (1994–2008), the CPP of delayed papers is always higher than that of flashes in the pan (Wilcoxon p < 0.001).
A pattern similar to that observed for CPP can be observed here. The distribution of the CPP/FCSm of researchers in their normal papers shows the highest scores as compared to the other types of durability regardless of the citation period (Wilcoxon Signed Ranks Test p < 0.001). Natural Resources researchers significantly increase their CPP/FCSm from one period to the other (p < 0.001), while in Biology and Biomedicine and Materials Science there is only a slight decrease in the impact (although still significant, Wilcoxon p < 0.05). These results can be linked to the fact that the three areas display different ageing patterns, and particularly that the area with the slower ageing pattern (Natural Resources) still displays an increasing pattern in CPP/FCSm with time, while the other two (with faster ageing patterns) show a slight decrease in the normalized citation impact scores.
On the other hand, researchers increase significantly in CPP/FCSm with their delayed papers for the longest citation period while flash in the pan papers decrease (p < 0.001 in both citation periods and for the three areas). This finding clearly shows the beneficial effects of longer citation windows on delayed publications and the opposite for flashes in the pan. In this line, during the shorter period (1994–2004), the CPP/FCSm of individual researchers in their flash in the pan papers is higher than that for their delayed publications (p < 0.05). The opposite situation occurs when the period of citation is longer (1994–2008): the CPP/FCSm of delayed production is higher than that of flashes in the pan (again p < 0.001).
In view of these results, it can be stated that with longer periods delayed papers improve the field-normalized impact of individual scientists, whereas the presence of flash in the pan papers reduces the field normalized impact.
Effects of durability on the rankings of individual researchers
According to the previous results, delayed and flash in the pan publications clearly play different roles in the evolution of the impact of the oeuvres of individual researchers. However, is the presence of these papers significantly important in the overall evaluation of researchers? In other words, what are the effects of the different types of durability on the measurement of the overall performance of individual researchers? Are the different durability types of publications a crucial element in the assessment of scientists based on bibliometric indicators?
Table 1. Pearson’s correlations by publications (three areas combined). [Table values not recovered; variables: P Total, P Normal, P Delayed, P Flash in the pan.]
As can be observed in Table 1, all correlations are significant (p < 0.001), indicating that the hypothesis of linear independence is rejected, or in other words, that the positions of researchers (in the rankings) according to the production of the different types of durability are significantly related. In this sense, having a high production (P Total) also implies having a high number of publications in each of the three types of durability. It is interesting that P Total is closely related to the production of normal papers (P Normal) with the highest Pearson’s correlation coefficient, while the lowest correlation is observed between P Delayed and P Flash in the pan, although the relation between both indicators is still significant.
Table 2. Pearson’s correlations by CPP/FCSm and durability types of publications (all three areas combined), for the citation periods 1994–2004 and 1994–2008. [Table values not recovered; variables include CPP/FCSm Total, Normal, Delayed and Flash in the pan.]
According to Table 2, it is remarkable that the rankings of scientists by the CPP/FCSm of Delayed and Flash in the pan papers are statistically independent (i.e., the correlation is not significant—shaded blocks), whereas their correlation with CPP/FCSm Total and Normal is not independent. In other words, if scientists are ranked by the CPP/FCSm of only their delayed and flash in the pan papers, their position in the rankings would be significantly different. However, this will not affect significantly the positions of scientists in their scientific community (in terms of CPP/FCSm) when the whole production is considered.
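The coefficients reported in Tables 1 and 2 are Pearson correlations between pairs of researcher-level indicators. A minimal sketch of the computation is given below; the function name is hypothetical, the inputs are assumed to be two equally long lists with one value per researcher, and the significance test on the coefficient is not included.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two indicators measured on
    the same researchers (e.g. CPP/FCSm of delayed papers versus CPP/FCSm of
    flash in the pan papers)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)
```

Only researchers with at least one paper of both durability types being compared can enter such a calculation, so the number of cases may differ from one cell of the table to another.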
In Appendix 2, complementary figures on the rankings of researchers by P and CPP/FCSm considering the different durability types of publications and also the two citation periods are presented.
Are there researchers suffering from a potential “Mendel syndrome”?
At this point we can ask ourselves if there are scientists with an unusually high amount of delayed (or flash in the pan) publications within their profiles. Particularly, we wonder if there are scientists with such “deviant” levels of delayed papers that one could suggest that they are suffering from the “Mendel syndrome” (i.e., having a higher presence of delayed publications within their publication profile). Conceptually, based on the percentages of publications by durability types, we can assume that there are three potential groups (clusters) of authors that are of interest for this analysis: one group of authors who would focus more on publishing Flashes in the pan; another one with authors presenting more Delayed publications, and finally a third group of authors for whom Normal publications are the norm.
Considering the previous assumption, we have performed an exploratory cluster analysis based on the distribution of the percentages of the three durability types for scientists using the k-means method. The k-means algorithm (MacQueen 1967) is one of the simplest and most widely applied nonhierarchical clustering techniques (Kaufman and Rousseeuw 1999). The algorithm focuses on partitioning a population into k sets. This process gives partitions (clusters) which are reasonably efficient in the sense of within-class variance.8 In this paper, we use this algorithm as it is implemented in SPSS 17.0 with the purpose of detecting groups (clusters) of authors who are different in their percentages of Normal, Delayed and Flash in the pan publications.
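The clustering step can be illustrated with a plain version of the k-means iteration (assign each point to its nearest centroid, then recompute the centroids). This is a sketch only: the SPSS 17.0 procedure used in the study may differ in initialization and stopping rules, and the starting centroids and the six researcher profiles below are hypothetical values chosen for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points: List[Point], initial_centroids: List[Point],
            max_iter: int = 100) -> Tuple[List[int], List[Point]]:
    """Basic k-means: returns one cluster label per point and the centroids.
    An empty cluster keeps its previous centroid."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable: converged
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(sum(dim) / len(members)
                                     for dim in zip(*members))
    return labels, centroids


# Hypothetical profiles: (% Normal, % Delayed, % Flash in the pan).
profiles = [(80, 10, 10), (75, 15, 10), (40, 50, 10),
            (35, 55, 10), (40, 10, 50), (35, 10, 55)]
# Hypothetical starting centroids, one per expected group.
start = [(100, 0, 0), (0, 100, 0), (0, 0, 100)]
labels, centroids = k_means(profiles, start)
```

Because the three percentages of each researcher sum to 100, the points lie on a plane; the clusters are therefore separated only by the relative weight of Delayed and Flash in the pan papers with respect to Normal ones.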
Table 3. Three clusters based on the distribution of publications by durability types. [Table values not recovered; surviving headers: % Top researchers, % Flash in the pan, Biology and Biomedicine.]
Secondly, Cluster 2 deals with higher percentages of delayed publications (in the case of Natural Resources it is the highest percentage, but for Biology and Biomedicine and Materials Science the number of normal papers is still the highest). We labeled this cluster as “+Delayed” (with this label we mean that Delayed papers are the deviant element in the profile, but not that they necessarily constitute the majority in the production of the researchers).
Thirdly, Cluster 3 includes scientists with higher shares of flash in the pan papers. Again in the case of Natural Resources, the flash in the pan share is the highest, while for the other two areas normal papers represent still the largest share. We labeled this cluster as “+Flash in the pan” (following similar criteria as in the interpretation of “+Delayed”). It is important to emphasize that although Natural Resources present a sharper distinction between the three clusters (the shares of Delayed and Flash in the pan papers are much higher than in the other two areas), the relative amount of researchers in these two clusters is very low (only about 12% of all researchers in this area) as compared to the other areas.
In Appendix 3 the correlation between the CPP/FCSm scores for 1994–2004 and 1994–2008 is presented for the three clusters, showing how +Delayed individuals have a pattern of impact more biased towards the second period (i.e. they benefit from the longer citation period), while “+Flash in the pan” display a bias towards the first period (i.e., showing the highest impact in the first period). Another interesting finding is that the mean age of “+Delayed” and “+Flash in the pan” researchers tends to be higher than for the other cluster, thus supporting the previous observation that younger researchers tend to produce papers with a more normal durability pattern, while the older ones produce relatively more flashes in the pan and delayed publications.
The most remarkable observation in Fig. 10 is that researchers in the cluster “+Normal” present a higher level of CPP/FCSm in both periods as compared to the other two clusters (p < 0.05). That means that even with the longest period, researchers with normal production have still the highest impact.
For the “+Delayed” and “+Flash in the pan” clusters we do not find statistically significant differences in CPP/FCSm between the two periods. The exception is the CPP/FCSm for the longer period 1994–2008 of Natural Resources. In this case the CPP/FCSm of “+Delayed” researchers improves enough to statistically outperform “+Flash in the pan” researchers (but not “+Normal” researchers).
According to these findings we can argue that cases of researchers with a “+Delayed” pattern of publications are not very common. Therefore, this low occurrence within the whole population of individual scientists suggests that the probability of having researchers seriously suffering from a potential “Mendel syndrome” which would significantly affect their performance and evaluation assessed by bibliometric indicators is low. As shown earlier, it relates only to about 12% of the around 1,000 scientists involved in our study, with statistical repercussions only in the comparison with “+Flash in the pan” researchers.
To give a policy-relevant indication, based on the results presented in Fig. 2 and Table 3, we can state as a rule of thumb that those scientists with less than 60–70% of publications with normal ageing patterns should be the subject of careful analysis in order to check whether they might suffer from a potential “Mendel syndrome”.
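As a rough illustration, this rule of thumb can be written as a simple screening function. The sketch assumes that each publication of a researcher already carries a durability label; the function names are hypothetical, and the default threshold of 0.60 is just one end of the 60–70% range mentioned above, to be adjusted by the analyst.

```python
from typing import List


def share_normal(durability_labels: List[str]) -> float:
    """Fraction of a researcher's publications labeled 'Normal'."""
    if not durability_labels:
        raise ValueError("researcher has no classified publications")
    normal = sum(1 for label in durability_labels if label == "Normal")
    return normal / len(durability_labels)


def needs_durability_review(durability_labels: List[str],
                            threshold: float = 0.60) -> bool:
    """True when the share of normal papers falls below the threshold, i.e.
    the profile deserves a closer look before citation-based assessment."""
    return share_normal(durability_labels) < threshold
```

A flagged profile is only a prompt for closer inspection: it may be dominated by flash in the pan papers rather than delayed ones, so the direction of the deviation still has to be checked.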
Discussion and conclusions
This study seeks to help fill a gap in the bibliometric knowledge of the ageing of papers and the effects of delays in the impact of publications for research evaluation at the level of individual scientists. More specifically, our analysis of the bibliometric performance of 1,064 Spanish individual researchers takes the different durability types of their publications into account. Although the population studied may present its own particular characteristics (scientists from the same institution and country, CSIC and Spain, with a permanent position, three research areas, etc.) we think that the results of this study are general and sound enough for the understanding of the influence and effects of durability of scientific literature at the individual level. Besides, the fact that very similar patterns have been observed among three quite distinct research areas suggests that these patterns could be expected in other fields and groups of researchers. In any case, our future plans include a more international view, combining more multidisciplinary approaches dealing not only with scientometric analysis, but also with sociological and historical aspects, as well as the analysis of more individual researchers, from more disciplines and countries, in order to discuss and expand the results presented here.
Differences in durability by research areas
Glänzel and Schoepflin (1994) argued that the ageing of scientific literature is strongly influenced by differences across research fields. This observation has been corroborated in our analysis showing that the evolutions of the average impact of publications for the three areas over time are different. Of the three areas studied, Biology and Biomedicine papers have the earliest peak in impact, in the third year after publication, and then a fast decrease in citations. It is also the area with the lowest share of delayed papers. Natural Resources is the area with the latest peak (around the fifth year), with a slow decrease in the impact, and with the most “delayed” citation pattern. This also becomes clear by the finding that Natural Resources researchers have the highest percentage of delayed papers and the lowest percentage of flash in the pan papers. Finally, Materials Science papers show an evolution in their impact similar to Natural Resources but with a faster decrease in their impact over time. Materials Science is also the area with the highest percentage of flash in the pan papers which indicates a stronger ageing pattern in this field. Yet, more research would be necessary in order to better understand the differences in the durability patterns across research fields.
A common pattern found among the three areas is that normal publications have the highest impact compared to the other two durability types of publications. Delayed papers are the ones with the highest increase of citations over time, while for flash in the pan papers we found a decrease in relative (field-normalized) impact as the citation period for publications becomes longer. This result is very relevant for indicators based on “citation speed” (Bornmann and Daniel 2010), as these indicators could be influenced by papers and units with more flashes in the pan. These results also underline that a citation window that is long enough (e.g., 5 years) is important in order to get a better balance between delayed and flash in the pan papers. In this regard, it is important to keep in mind that in general the oeuvres of researchers comprise a combination of different durability types of publications (normal, delayed and flashes in the pan), as well as a combination of different years of publication (i.e., different citation windows are applied), which helps to balance the different durability effects of publications.
Effects of durability at the individual level
Generally, individual researchers tend to present around 60–70% of their papers as normal papers, and only around 20% or less of their publications as delayed and/or flash in the pan papers. Another interesting conclusion is that the impact characteristics of individual scientists are affected by the ageing patterns in their research fields. On the one hand, Natural Resources researchers have the highest percentages of delayed papers and the lowest share of flashes in the pan, being in line with the observations that papers of this area show a more delayed citation pattern. Materials Science researchers display the lowest percentage of normal publications and the highest rates of flash in the pan papers, while Biology and Biomedicine researchers have the highest rate of normal papers and the lowest share of delayed papers, suggesting that these two areas have a faster pattern of ageing that is reflected in the citation impact of their scientists.
Regarding the performance classes of researchers (Top, Medium and Low), top class scientists have the highest shares of normal papers and the lowest shares of delayed papers and flashes in the pan. Accordingly, these results suggest that the “best” scientists tend to publish papers that are accepted and cited by their peers regularly and within a reasonable time. Another explanation, in line with the claims of Van Dalen and Henkens (2004), is that authors with distinguished reputations (e.g. top researchers) have an advantage when competing for attention, which prevents delays in the reception of their publications.
In this latter context, it can also be suggested that top scientists not only succeed in producing important scientific results, but are also successful in communicating these results in a way that is properly understood and accepted by their contemporary colleagues. They are therefore characterized by producing fewer “trendy” but quickly forgotten papers (i.e. flash in the pan papers) and also fewer papers that need more time to be accepted by their colleagues (i.e. delayed papers). Thus, the ability of scientists to produce steady and directly relevant knowledge that is appreciated by their current peers seems to be a good property for the best reception of scientific work. This is in line with the claim of Garfield and Malin (1968) that “modern cases of Mendelism” are more “the own fault” of researchers as they are not able to “sell” and communicate their ideas in a proper way. Even Mendel’s case has sometimes been attributed to a failure in communication (MacRoberts 1985) rather than to a scientific community neglecting his results. Research managers can therefore benefit from this methodology, as they can detect individuals in their organizations who could potentially benefit from programs aimed at improving their scientific communication skills and at preventing potential cases of Mendelism.
In line with the previous statements, the age of scientists is also linked to the distribution of papers by durability. Younger scientists have proportionally more normal papers while older researchers have more delayed and/or flash in the pan papers. This can also be related to the observation that younger researchers have the best bibliometric profiles (younger researchers tend to be in the Top performance class) as observed by Costas et al. (2010a). Therefore, the often-heard suggestion that scientific results of younger researchers may be “resisted” (and therefore delayed) by older researchers (Barber 1961) does not seem to be supported. Younger scientists tend to publish even more normal papers than their older colleagues. This is also in agreement with Cole (1970), who already claimed that delayed papers are not more likely to be written by younger scientists than papers receiving immediate recognition.
Another interpretation of these results is that while younger scientists produce more regular mainstream science (i.e. normal papers), older and more experienced scientists participate more in scientific debates (producing more flash in the pan papers) and also publish more on topics outside the mainstream patterns of research within their scientific fields, and therefore produce more delayed (and/or flash in the pan) papers.
Our results show that normal publications are the most important in terms of high individual performance and scientific success, while delayed and flash in the pan papers are more an exception than a dominant characteristic of the publication oeuvre of researchers. Accordingly, although delayed and flash in the pan papers have different effects on bibliometric indicators at the individual level (delayed papers tend to increase the relative impact of researchers with time, while flash in the pan papers tend to decrease it when the citation period is lengthened), their influence on the overall impact scores of individual scientists does not play an important role. In other words, the evaluation of scientists is generally not significantly affected by the presence of these durability types among their publication output. Besides, the fact that there are not many researchers with very deviant levels of delayed or flash in the pan papers suggests that these two types of durability occur in a quite balanced way within the profiles of individual researchers.
The “Mendel syndrome”
Among the three areas and considering the shares of durability types of publications within the bibliometric output profiles of researchers, we can distinguish three main clusters of scientists, based mainly on the higher or lower presence of normal papers.
The first cluster presents shares of normal papers around 70–80%, whereas the other two types of durability account for around 20–30% of the total publications of researchers. Furthermore, this is the cluster that covers around half or more of the researchers (for Natural Resources and Biology and Biomedicine more than 60% of researchers belong to this cluster, for Materials Science 47%).
The second cluster is composed of individual scientists with lower shares of normal papers (generally below 60%) and higher levels of delayed papers (shares higher than 30%). Scientists in this cluster could be considered as potential candidates to suffer from the “Mendel syndrome”. However, the analysis of their relative impact based on a longer citation period indicates that their position when ranked on normalized citation impact does not really improve compared with researchers with a normal pattern of durability. Only in Natural Resources can this higher share of delayed papers affect the measured performance of researchers when compared to their “flash in the pan-oriented” colleagues. This suggests that, although a small proportion of researchers can suffer from delayed patterns in their impact, taking longer periods of time (e.g. 5 years) for the citation window does not necessarily improve their performance compared to their colleagues. It could be argued that perhaps with much longer citation windows (longer than 5 years) these researchers could outperform their colleagues. But then the length of the citation window needed to conduct an assessment may preclude a bibliometric evaluation of a scientist’s research output within a time frame that is still reasonable from a research policy perspective.
The third cluster is composed of researchers with less than 60% of normal papers and generally more than 20% of flash in the pan papers. The impact of these researchers remains at the same level or slightly decreases when the citation window is extended. However, their impact position as compared to their colleagues does not change significantly, probably due to the still dominating presence of normal (and some delayed) papers in their oeuvres.
According to the results of our study, we can conclude that there are indeed papers with different durability patterns and that these patterns can be detected very well through bibliometric indicators (Glänzel et al. 2003; Van Raan 2004). However, at the same time we can also conclude that at the most basic level of aggregation (i.e., the individual scientist) a significant effect of these durability types of publications (and especially the “Mendel syndrome”) on the bibliometric scores of individual researchers cannot be observed.
Bibliometric and research evaluation implications
We observed that potential cases of “Mendelism” are rare. Only 12% of our total set of scientists can be considered to be potentially suffering from this syndrome, but even in these cases there is no significant effect on the measured performance when compared to their colleagues. In other words, although it is possible to “wait” longer in order to increase the citation window, this does not imply that these researchers will significantly improve their positions in their scientific communities.
A policy-relevant implication of the results from this study is that the new bibliometric tools developed for the analysis of the durability of publications (Costas et al. 2010b) provide valuable insight into the phenomenon of durability of the scientific publications of scientists. Through these tools it becomes possible to detect deviant patterns of publication output among individuals. In this sense, our study provides policy makers and evaluators with a specific rule of thumb threshold value: scientists with shares of normal papers below 60–70% should be examined carefully in order to check whether they are potential cases of “Mendelism”. Furthermore, the improvement and further development of these bibliometric tools for the broader analysis of the durability of scientific literature can significantly contribute to detecting cases of lack of recognition in science. This will enable us to analyze occasional claims of being “ahead of time” and to help scientists whose output profiles show signs of delayed recognition to improve their communication strategies, thus contributing to a more sustained and fair evaluation practice.
Finally, more research is necessary in order to detect other possible factors influencing the durability of scientific literature, including aspects related to multidisciplinarity, applied versus basic research orientations, the availability of online journals and effects of the Open Access development, etc. The analysis of durability for the assessment of the performance of other units of aggregation (such as research teams, institutions, journals, etc.) will also be conducted in future studies.
Gregor Mendel’s discoveries in plant genetics (Mendel 1865) were so unprecedented that it took 34 years for the scientific community to catch up with them (Punnett 1907), although it has even been questioned whether Mendel’s paper is a real case of delayed recognition (Moore 2001) or a case of “typical” science only “rediscovered” for a later scientific dispute. As suggested by one of the reviewers, this also raises the question of whether the role of a “prince” (in the “Sleeping Beauties” metaphor) is to rediscover the idea or to rediscover the paper, an issue that is worth studying in the future.
The CSIC is organized in eight main research areas: Agriculture; Biology and Biomedicine; Chemistry; Food Science and Technology; Materials Science; Natural Resources; Physics; and Social Sciences and Humanities. For a description of the WoS subject categories that are related to the three CSIC research areas analyzed in this study we refer to Costas et al. (2009).
“Normal” in this context refers to the property of these papers of being the majority in their fields, thus these papers determine the pattern of ageing that can be considered the most common, standard or “typical” in the field.
Katherine McCain (2011) has observed this property for two publications of the Nobel Prize winner John Forbes Nash.
It is important to recall here that bibliometric indicators are an important tool in research evaluation due to their quantitative and objective nature. However, they only measure the research dimension of scientific activity and do not account for other activities such as teaching, mentoring, patenting, consultancy, etc.
For the determination of “Years after publication”, the citation history of each document was analyzed and the citations of each paper were assigned to their corresponding year after publication (note that year 1 after publication refers to the year of publication of the source paper itself). With this approach it is possible to work with all the papers of the set although they were published in different years.
Note that Figs. 4, 5, 6, 7, 8, 9, 10 and 11 are based on the distribution of researchers and their values on the different indicators calculated. This means that although some publications may be co-authored by researchers from different classes or ages, this has no influence on the results of the figures, as they are based on the distribution of the individual scores of researchers.
The process consists of a first step where k centroids are defined, one for each cluster (the centroids should be placed as far away from each other as possible). The next step is to assign each point to the nearest centroid, and when all objects have been assigned the centroids are recalculated. The two previous steps are repeated (iterations) until the centroids no longer move.
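The assignment of citations to years after publication can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name and input layout are hypothetical, and skipping citations dated before the publication year is an added assumption.

```python
from collections import Counter
from typing import Dict, Iterable


def citations_by_year_after(pub_year: int,
                            citing_years: Iterable[int]) -> Dict[int, int]:
    """Count citations per year after publication.

    Year 1 is the publication year itself, as in the definition above.
    """
    counts: Counter = Counter()
    for year in citing_years:
        if year < pub_year:
            continue  # assumption: ignore citations dated before publication
        counts[year - pub_year + 1] += 1
    return dict(sorted(counts.items()))


# A paper from 2000 cited once in 2000, twice in 2001 and once in 2004.
print(citations_by_year_after(2000, [2000, 2001, 2001, 2004]))
# → {1: 1, 2: 2, 5: 1}
```

Because the index is relative to each paper's own publication year, papers published in different calendar years end up on the same time axis and can be pooled.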
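The k-means steps described in this note can be sketched in plain Python. This is an illustration under assumptions rather than the procedure actually used in the study: the source does not state which software was used or how the initial centroids were chosen, so the farthest-first initialization, the function names and the example shares below are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def farthest_first(points: List[Point], k: int) -> List[List[float]]:
    """Pick k starting centroids that lie far apart (an assumed heuristic)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Take the point whose nearest chosen centroid is farthest away.
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def k_means(points: List[Point], k: int,
            max_iter: int = 100) -> Tuple[List[List[float]], List[int], int]:
    """Return (centroids, labels, iterations used)."""
    centroids = farthest_first(points, k)
    labels: List[int] = []
    for iteration in range(1, max_iter + 1):
        # Step 1: assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Step 2: recalculate each centroid as the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append([sum(col) / len(members)
                                      for col in zip(*members)])
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        # Stop when the centroids no longer move.
        if new_centroids == centroids:
            return centroids, labels, iteration
        centroids = new_centroids
    return centroids, labels, max_iter


# Invented shares (normal, delayed, flash in the pan) for six researchers.
shares = [(0.80, 0.10, 0.10), (0.75, 0.15, 0.10),
          (0.50, 0.40, 0.10), (0.55, 0.35, 0.10),
          (0.50, 0.15, 0.35), (0.55, 0.10, 0.35)]
centroids, labels, iterations = k_means(shares, k=3)
print(labels)
```

With these invented shares the three groups correspond to mostly normal, delayed-oriented and flash in the pan-oriented profiles, mirroring the three clusters discussed in the text.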
The same analysis has been performed with other sorting options and the results are basically the same.
The solution of three clusters can be viewed also as the best solution because when four clusters are forced, two of them tend to be very big, while the other two remain very small.
Natural Resources needed 4 iterations, Biology and Biomedicine 17 and Materials Science 13 iterations, until no more changes in the centroids were found.
The authors are very grateful to Prof. Maria Bordons from CSIC and Martijn Visser from CWTS for their advice and comments on an earlier draft of this paper, as well as to the two anonymous reviewers, whose comments and views have significantly contributed to the improvement of the original manuscript.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
- Beed, C., & Beed, C. (1996). Measuring the quality of academic journals: The case of economics. Journal of Post Keynesian Economics, 18(3), 369–396.
- Bordons, M., Zulueta, M. A., Cabrero, A., & Barrigon, S. (1995a). Identifying research teams with bibliometric tools. Proceedings of the Fifth Biennial Conference of the International Society for Scientometrics and Informetrics (pp. 83–92). River Forest, IL, USA: Rosary College.
- Bordons, M., Zulueta, M. A., Cabrero, A., & Barrigon, S. (1995b). Research performance at the micro level: Analysis of structure and dynamics of pharmacological research teams. Research Evaluation, 5(2), 137–142.
- Cohn, E. G., Farrington, D. P., & Wright, R. A. (1998). Evaluating criminology and criminal justice. Westport, CT: Greenwood Press.
- Costas, R., Bordons, M., van Leeuwen, T. N., & van Raan, A. F. J. (2009). Scaling rules in the science system: Influence of field-specific citation characteristics on the impact of individual researchers. Journal of the American Society for Information Science and Technology, 60(4), 740–753.
- Costas, R., van Leeuwen, T. N., & Bordons, M. (2010a). A bibliometric classificatory approach for the study and assessment of research performance at the individual level: The effects of age on productivity and impact. Journal of the American Society for Information Science and Technology, 61(8), 1564–1581.
- Costas, R., van Leeuwen, T. N., & van Raan, A. F. J. (2010b). Is scientific literature subject to a ‘sell-by-date’? A general methodology to analyze the ‘durability’ of scientific documents. Journal of the American Society for Information Science and Technology, 61(2), 329–339.
- Garfield, E. (1970). Would Mendel’s work have been ignored if the Science Citation Index was available 100 years ago? Current Contents, 2, 69–70.
- Garfield, E. (1980). Premature discovery or delayed recognition—Why? Essays of an Information Scientist, 4, 488–493.
- Garfield, E. (1989). Delayed recognition in scientific discovery: Citation frequency analysis aids the search for case histories. Essays of an Information Scientist, 12, 154–160.
- Garfield, E. (1990). More delayed recognition. Part 2. From Inhibin to Scanning electron microscopy. Essays of an Information Scientist, 13, 68–74.
- Garfield, E., & Malin, M. V. (1968). Can Nobel Prize winners be predicted? 135th Annual Meeting, American Association for the Advancement of Science. Dallas, TX: AAAS.
- Glänzel, W. (2008). Seven myths in bibliometrics—About facts and fiction in quantitative science studies. ISSI Newsletter, 4(2), 24–32.
- Glänzel, W., & Garfield, E. (2004). The myth of delayed recognition. The Scientist, 18(11), 8.
- Glänzel, W., & Schoepflin, U. (1994). A stochastic model for the ageing of scientific literature. Scientometrics, 30(1), 49–64.
- Kaufman, L., & Rousseeuw, P. J. (1999). Finding groups in data: An introduction to cluster analysis (p. 113). New York, NY: Wiley.
- Line, M. B. (1993). Changes in the use of literature with time—Obsolescence revisited. Library Trends, 41(4), 665–683.
- MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. Proceedings of the Fifth Symposium on Mathematical Statistics and Probability (pp. 281–297). Berkeley, CA: University of California Press.
- McCain, K. W. (2011). Eponymy and obliteration by incorporation: The case of the “Nash Equilibrium”. Journal of the American Society for Information Science and Technology, 62(7), 1412–1424.
- Mendel, G. (1865). Versuche über Pflanzen-Hybriden. Proceedings of the Natural History Society of Brünn (Vol. 4, pp. 3–47).
- Moore, R. (2001). The “Rediscovery” of Mendel’s work. Bioscience, 27(2), 13–24.
- Nederhof, A. J., & van Raan, A. F. J. (1987). Peer review and bibliometric indicators of scientific performance: A comparison of cum laude doctorates with ordinary doctorates in physics. Scientometrics, 36(2), 185–206.
- Sandström, U., & Sandström, E. (2009). Meeting the micro-level challenges: Bibliometrics at the individual level. 12th International Conference on Scientometrics and Informetrics (pp. 845–856). Rio de Janeiro: BIREME/PAHO/WHO.