Introduction

There is continuing interest in the micro-level dynamics of science, particularly to better understand how policy affects the science system. One of the most important problems in science policy concerns the definition and realisation of scientific priorities (Dasgupta and Maskin 1987). Governments, funding councils, universities, and individual researchers are continually searching for the most promising and dynamic areas. They use a wide range of instruments to shift resources and align research agendas, including funding opportunities, research coordination and power (Lepori 2011) as well as incentives and inspiration (Verbree et al. 2012b).

A major challenge of developing institutional and national research priorities is that science is a complex adaptive social system (Wagner and Leydesdorff 2005). Bound by the rules and structures laid down by government and the scientific community, national and institutional portfolios emerge from the simple rules that drive the behaviour of individual scientists and research groups. The results can be unexpected, even counterintuitive. Understanding how individual scientists apply those rules to build a personal portfolio of research is key to understanding outcomes on the level of scientific fields, institutions, and systems. How do scientists develop their research agenda? What is their search strategy? And where policy instruments are used to align or shift research agendas, how can we tell if they have successfully changed the behaviour of individual scientists?

A few exceptions notwithstanding (Laudel and Gläser 2008; Zuckerman and Cole 1994), little attention has been paid to the way in which scientists develop a research portfolio in the course of their career. The most important obstacle is that we lack the scientometric and statistical instruments to examine the development over time of the involvement of researchers in different problem areas. In this paper we present a novel scientometric method to map, measure, and compare the lifetime corpus of individual scientists. We provide proof of concept by using this method to analyse the search strategies of 43 condensed matter physicists along their academic lifecycle.

In “Conceptual framework” we develop six propositions. The scientometric methods are developed in “Methods and data”. We then test our propositions in “Results” by applying the methods to data on the lifetime publications of 43 physicists. We analyse the results in “Analysis”. In “Conclusions and discussion” we summarise the main conclusions and discuss the implications.

Conceptual framework

The dynamics of the scientific search for new knowledge have attracted the interest of a wide variety of scholars for over half a century. Empirically, the main emphasis has consistently been on the development of researcher productivity along an academic career in relation to the main incentives (Stephan and Levin 1992). A similar line of research concerns star scientists, specifically what distinguishes stars from ordinary scientists (e.g. Zucker and Darby 1996; Zuckerman 1992) and how to identify them in terms of output and productivity in relation to age (Costas et al. 2010) and role (Bayer and Smart 1991). Lifecycles have been studied to examine changes over time in the productivity of researchers (Carayol and Matt 2004, 2006; Falagas et al. 2008; Levin and Stephan 1991; Reskin 1977) and the activity profile of institutes (Braam and van den Besselaar 2010).

Research productivity deals with aggregate output, while search suggests the possibility of exploring multiple topics of research. An individual scientist’s portfolio reflects his curiosity and the opportunities with which he is presented, such as new talent in his group, new funding opportunities, emergent research themes, or simple serendipity. Scientific portfolio management is all about spreading risks, maximizing reputational gains, and satisfying personal (intrinsic and extrinsic) motivations.

The reward system of science

Since Merton’s more philosophical work on the reward system of science (Merton 1957, 1970), scholarly behaviour has been explained as a search for priority with reputation as a reward. Scientists strive to be the first to find an original result, while their peers in the field review and validate their work, thus assigning value and giving recognition. The reward system of science was further elaborated and empirically tested from a sociological perspective (e.g. Cole 1970; Cole and Cole 1967; Reskin 1979). Priority seeking and reputation were also at the heart of the new economics of science that arose in the 1990s (Dasgupta and David 1994; Stephan 1996; Stewart 1995). It is also now understood that scientists respond to a range of different, sometimes competing incentives, including the need to search for priority and establish a reputation, external demand for the results of a project, their own interest or curiosity, and—to a lesser extent—extrinsic rewards such as prizes, honours, and salary (e.g. Calderini et al. 2007; Verbree et al. 2012a).

Zuckerman and Cole (1994) show how the reward system functions. They used interviews to find out if eminent scientists use different research strategies than ordinary scientists, which might account for their higher performance in numbers of publications and citations. The researchers Zuckerman and Cole interviewed selected problems based on three criteria: (1) how important they believe the problem to be as well as how their peers will respond when it is solved, (2) how easy or difficult it will be to solve the problem, and (3) how long it will probably take to get results. These criteria were then weighed against the degree of competition around the problem.

“Are a good many others working on the problem and is the competition apt to be stiff? By and large, these established scientists say they consider it a waste of time to work on problems actively being pursued by others. […] Although the presence of competition may deter scientists from taking up a particular problem, it is apparently not sufficient to prevent them from doing so if the problem involved is judged of prime scientific significance—and if they think they can solve it first.” (Zuckerman and Cole, 1994, pp. 396–397)

According to Zuckerman and Cole, “eminent” scientists are more willing to engage the competition, while “rank-and-file” scientists prefer to avoid problems that others are working on. The complexity of the problem also matters. More complex problems take more time to solve. Such niches will be less crowded, but the potential reputational gains are higher. They will tend to attract fewer but more eminent researchers, thus raising the risk of being scooped.

Hagstrom (1974) used surveys to measure similar considerations twenty years earlier. He shows that the motivational drivers proposed by Merton and others actually work. Problem selection is associated with competitive intensity and with a scientist's personal ability to compete, and the perceived intensity of competition varies by discipline and age. Hagstrom also noted possible perverse effects of intense competition, such as questionable conduct (Anderson et al. 2007) and an increase in secrecy among researchers (Hong and Walsh 2009). He predicted that the nature of scientific competition might change if the social organisation of science changed, for example in response to the rise of big science. Several authors have observed such an increase in competition. For example, Rauber and Ursprung (2008) examined the productivity of different age cohorts of German economists over time. Their results show the increasing competitiveness of academia: younger cohorts are far more productive than older cohorts. Hessels (2010) shows that competition is becoming more intense. Within science, there is an ever stronger pressure to publish or perish. At the same time, there is an increasing call for science to produce socially (or economically) relevant knowledge.

Community

The literature on the reward system of science shows that problem choice is the main instrument of competition between an individual scientist and his peers. Problem choice is driven by the possibility of gaining reputation. Yet, reputational gains require a community of peers who work on the same or similar problems and can recognize achievement.

The work of Zuckerman, Cole, and Hagstrom suggests that scientists select problems based on their private perception of the trade-off between community size and potential marginal reputational gains. It is relatively easier to be the first to find a result in a very small niche, but there will be fewer peers to recognise the achievement. In a very large or crowded niche, achieving priority is much more difficult, but the potential gains may be much higher than in a small niche. The search for a niche and the need for a community that grants a reputation are consequently at odds. Perceptions of the trade-off will change over time, owing to the accumulation of reputation, changes in academic status, and experience. It may be easier for a reputable scholar than a rank-and-file scientist to produce a high-impact contribution in a crowded field and the same may be true for publishing work on an entirely new, self-defined problem. Highly complex problems attract fewer but eminent researchers, while rank-and-file researchers crowd specialties of less complex problems (Zuckerman and Cole 1994).

The decision space of scientists is bounded by the requirements of their global community of peers and by local conditions and opportunities. Whitley (1974) refers to the degree of cognitive institutionalisation of scientific fields, which depends on “the degree of consensus and clarity of formulation, criteria of problem relevance, definition and acceptability of solutions as well as the appropriate techniques used and instrumentation.” (Whitley 1974, p. 72) Where the quest for priority—i.e. originality—creates divergence and task uncertainty, the need for reputational gains forces scientists to conform to a community of peers who can validate and replicate their work (Whitley 2000). This community structures the search for new problems. In other words, for a scientist to be able to compete he has to have competitors with whom he has to reach a level of consensus on the basic premises of the problem area.

The interaction between a scientist and his community can be seen as a way to gather the resources necessary for doing research on a particular problem (cf. Pfeffer and Salancik 1978). By collaborating with their peers, scientists gain access to crucial resources. High academic reputation, specific expertise, and access to facilities, equipment, and data are the fuel for preferential attachment (e.g. Birnholtz 2007; Bozeman and Corley 2004; Melin 2000; van Rijnsoever et al. 2008). We can also take a more sociological view. In the words of Lave and Wenger (1991), scientists form communities of practice around problem areas, in specialties, and in fields. Scientists learn from each other by sharing knowledge, for example at conferences or through collaboration. The communities they form have a formal dimension—think, for example, of academic associations—but more often they are self-organising or emergent (e.g. Brown and Duguid 1991). From this perspective, it is interesting to note the rise of team science around some of the hardest problems in science (Stokols et al. 2008).

Problem choice is one of the ways in which individual scientists strategically position themselves in a wider environment. They develop their own research topics that latch onto problem areas defined by their community or by society at large as expressed in public debates or in funding opportunities. Problem choice, the accumulation of reputation, community development, and the collection of resources are dynamically interrelated.

The economic model in which behaviour is driven by individual rules and preferences in interaction with an outside environment provides a good understanding of problem choice in science. It is important to keep in mind that models are a simplification of reality. Most scientists are motivated by more than possible reputational gains; they are also intrinsically motivated. Models cannot capture all dimensions of their behaviour. This is particularly true for curiosity, creativity, and serendipity, which introduce a degree of randomness into problem choice. However, even though each individual scientist will have a private heuristic, at the system level or across large populations of researchers the model will hold.

Search strategies

The aim of our analysis is to characterise the search strategies of scientists. “Search” denotes the process by which an individual scientist identifies, enters, develops, and exits a problem area and its associated community of peers. We refer to a scientist’s activity in one problem area as a research trail. “Strategy” refers to the scientist’s strategic positioning in a competitive environment and presumes a degree of planning, coherence and consistency in problem choice over time. We expect search strategies to evolve along the academic lifecycle. The search process and the strategy behind it are dynamic and interrelated, each developing in response to changes in status and position, the availability of resources and access to social networks, the constraints imposed by prior work, and unexpected findings and opportunities. We can see an individual scientist’s search strategy as the way in which he negotiates his way through Bonaccorsi’s (2008) search regimes.

Propositions

In this section we develop a number of propositions with regard to the social and cognitive dynamics in the work of individual scientists that our method should be able to measure.

Proposition 1

A scientist’s work consists of multiple finite research trails

Scientists develop along an academic lifecycle. As they age, scientists gain experience, develop a set of skills and achieve reputational gains, which affects their ability to gain access to critical resources for the next problem (Kyvik and Olsen 2008). Moreover, problems can be solved, if not by the researcher himself then by his peers.

Proposition 2

A scientist will work in several parallel research trails

Scientists can be active in different niches, each with its specific consensus on the most important problems and the state-of-the-art of data, resources, standards and criteria. If task uncertainty is high then consensus among peers is low and scientists can earn recognition for a more diverse set of problems (Whitley 2000). This implies that a scientist may work for different communities of peers, using different funding sources, and working with different networks of collaborators (Zuckerman and Cole 1994, pp. 398–399). Flexibility in problem choice does depend on access to resources, including a team of researchers (and their inherent knowledge and skillsets), funding, facilities, and data.

Proposition 3

A scientist’s role in research trail selection changes along the lifecycle

Scientists progress up the academic hierarchy from PhD to postdoc to professor. This progression changes the role they play in problem choice. In addition, the lifecycles of individual researchers overlap and they interact while at different stages of their individual lifecycles (Dietz et al. 2000). A good example is that of the PhD student and his supervisor. At the start of a scientist’s career, his supervisor may assign him a problem to solve, while as professor he assigns problems to his PhDs. Over time, individual autonomy with respect to problem choice will increase and involvement in problem areas will change from first-hand involvement in one narrow problem area to a supervisory role in a range of problem areas. As research leaders, individual scholars supervise, inspire, and manage a collective of researchers at varying levels of academic development.

Proposition 4

The start and end of research trails is associated with career changes

The behaviour of scientists changes over time. Scholars rise in the academic hierarchy, move between institutions, and develop a social network. Verbree et al. (2012a, b) show that the behaviour of medical research group leaders varies according to their age, the phase of their lifecycle (especially when they near retirement), and the dominant incentives in the science system during their PhD phase. As their status changes—from PhD student to postdoctoral researcher to full professor—so does their role in agenda setting. A higher position provides new capabilities. Moving to a different institute gives access to new expertise, better facilities and support, a different environment. Academic careers are not uniform. Dietz and Bozeman (2005) show that there are different paths of academic advancement and that the nature of careers has changed over time. They also find that job changes boost researcher productivity.

Proposition 5

The start and end of research trails is associated with the potential for reputational gain

Scientists enter new niches in the hope of accumulating additional reputation. The implication is that if the niche does not deliver, they will abandon it. Behaviour may change along the lifecycle. As Hagstrom (1974) put it: “the marginal value of each discovery is greater for younger men”. Also, as a scientist accumulates reputation, the probability that other scientists will want to collaborate (and co-author) for work in the same area and in adjoining areas rises (Melin 2000). Activity in a problem area may become self-sustaining.

Proposition 6

A scientist’s portfolio will converge before it diverges

Problem choice is a non-random process. Each decision is linked to the previous one, and we can expect most research trails to be connected. After all, a scientist builds on his accumulated stock of expertise, reputation, network relations, and resources. A succession of research trails creates path dependency. We expect to find that after a scientist finds his core niche and discovers his reputational blockbuster, his portfolio will tend to converge. Only later in his career, when the potential for marginal reputational gains as well as the risk of entering new niches goes down, will divergence occur.

Methods and data

In this section we explain how we map an individual scientist’s portfolio over time and which data were used to do the mapping. In the “Results”, the method will be applied to empirically test the propositions.

Method

There are several ways to map the structure of scientific fields. Boyack and Klavans (2010, 2009) use co-citation analysis to map the grand structure of science. They identify current paradigms and their relative position in the entire body of scientific output. A related method is bibliographic coupling (e.g. Jarneving 2007). Science overlay maps allow researchers to reveal the disciplinary orientation of a researcher, country, institution or field by mapping the subject areas in the relevant body of output onto the journal structure of science (Rafols et al. 2010). Such maps show structures that emerge from underlying social and cognitive dynamics that have been studied in the literature since the 1950s. We know that the dynamics of science can be traced to the behaviour of individual scientists. Yet, we lack the methods to map and measure those dynamics.

To meet the requirements of our analysis, we need to adjust extant methods in two ways. First, we must find a way to map paradigms or clusters in a set of publications over time, showing both the development of each individual cluster and the degree of similarity between clusters. Second, our method must capture the search for priority as well as the need for a community. What we want to measure is thematic selection and strategic positioning: selection in context. This means that we are looking for two dimensions. Along the first dimension an individual researcher demarcates his discrete epistemic niche. This describes what he or she actually researches. Along the second dimension a researcher connects to a specific community of peers. These are the people who study the same problem and who review, validate, and replicate their work.

Van den Besselaar and Heimeriks (2006) have developed a method that measures along these two dimensions. They map the structure of a field by measuring the similarity between publications in terms of shared combinations of title words and cited references. Title words catch the first dimension by describing the contents of a publication. Cited references capture the second dimension. Authors who refer to the same body of literature work in the same research tradition. They form a community of peers. The result may be considered a proxy for the epistemic culture to which a researcher belongs. Title words and cited references capture the researcher’s contribution to ‘what we know’, the “signature of the knowledge claim” (Lucio-Arias and Leydesdorff 2009), while his reputation is decided by a collective of peers in the same community (Knorr-Cetina 1999).

We map the structure of an individual researcher’s lifetime portfolio using title word-cited reference combinations to calculate the similarity between publications. The SAINT toolkit (Somers 2009) was used to transform raw data into networks of similar papers for analysis and visualisation. The ISI parser turns raw Web of Science data into a relational database. This allows the user to examine any possible combination of data. The Word Splitter parses titles and abstracts into individual words, providing a stemmed version of each word using the Porter stemming algorithm (Van Rijsbergen et al. 1980) and removing user-defined stopwords. From the relational database, we extract combinations of stemmed title words and cited references. For each pair of publications A and B, the Tanimoto coefficient (a derivative of the Jaccard similarity coefficient) τ is calculated:

$$ \tau(A, B) = \frac{N_{AB}}{N_A + N_B - N_{AB}} $$

where $N_A$ is the count of word-reference combination tokens in A, $N_B$ is the count of tokens in B, and $N_{AB}$ is the count of tokens shared between A and B. This gives us the basic data needed to construct a network consisting of a set of publications (the nodes) and a similarity between each pair of publications (the edges).
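As an illustration, the following sketch re-implements this similarity measure in Python. It is not the SAINT toolkit: the stopword list, the whitespace tokenisation of titles, and the publication data structure are simplifying assumptions.

```python
from itertools import product
from nltk.stem import PorterStemmer  # Porter stemming, as in the Word Splitter

STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "for", "with"}  # illustrative list
stemmer = PorterStemmer()

def tokens(pub):
    """Stemmed-title-word x cited-reference combination tokens for one publication,
    given as {'title': str, 'references': [str, ...]}."""
    words = [stemmer.stem(w) for w in pub["title"].lower().split() if w not in STOPWORDS]
    return {(w, r) for w, r in product(words, pub["references"])}

def tanimoto(pub_a, pub_b):
    """Tanimoto coefficient: N_AB / (N_A + N_B - N_AB) over shared combination tokens."""
    a, b = tokens(pub_a), tokens(pub_b)
    if not a or not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)
```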

The Community Detection Tool within SAINT uses the community detection algorithm of Blondel et al. (2008) to demarcate clusters of highly similar publications within the network. Blondel et al.’s method identifies sets of highly interconnected nodes within large networks, producing a community structure with high modularity (i.e. high density of links within communities and low density of links between communities). Their algorithm has three distinct advantages: (1) it is a multi-level algorithm that shows the hierarchical structure of the network and allows analysis of communities at different levels of aggregation; (2) it is able to detect very small communities; and (3) it runs very quickly on large networks. In our analysis, communities represent specialties within the academic corpus of researchers. Some publications have no similarities to other publications in a corpus. The community detection algorithm isolates these into single-node communities. These papers were ignored.
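The paper uses SAINT’s Community Detection Tool; a comparable sketch using the Louvain implementation in networkx (which follows Blondel et al.’s algorithm) could look like this. The similarity threshold and random seed are illustrative choices.

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities

def detect_problem_areas(pubs, min_similarity=0.0):
    """Build the similarity network over a corpus and demarcate clusters of similar
    publications with the Louvain method (available in networkx >= 2.8)."""
    G = nx.Graph()
    G.add_nodes_from(range(len(pubs)))
    for i in range(len(pubs)):
        for j in range(i + 1, len(pubs)):
            w = tanimoto(pubs[i], pubs[j])
            if w > min_similarity:
                G.add_edge(i, j, weight=w)
    communities = louvain_communities(G, weight="weight", seed=42)
    # Publications with no similarity to any other paper end up as single-node
    # communities; these are ignored in the analysis.
    return G, [c for c in communities if len(c) > 1]
```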

We focus only on citable documents, that is, articles, conference proceedings, letters, notes, and reviews. Citable documents are the foundations of an academic career: it is through citation that we can measure the marginal reputational gain that a publication produces. Where we map portfolios and scale the size of individual publications and where we analyse reputational gains, we will use average annual citations received until the time when we downloaded the data (March–May 2011). Total citations will tend to overestimate the impact of older publications. This is of course an oversimplification of the problem, as can be read in numerous articles about citation ageing (Bouabid 2011).
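As a minimal sketch of this normalisation (treating papers published in the download year as having one year of citation exposure is our assumption):

```python
def avg_annual_citations(total_citations, pub_year, download_year=2011):
    """Average citations per year between publication and the download date,
    so that older publications are not favoured simply for having had more time."""
    years = max(download_year - pub_year, 1)  # at least one year of exposure
    return total_citations / years

# Example: a paper from 2005 with 42 citations by 2011 scores 7.0 citations per year.
print(avg_annual_citations(42, 2005))
```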

CV data are used to differentiate between the PhD, postdoctoral and professorial phases of an academic career as well as to identify the major career moments. Major career moments include changes in position (e.g. postdoctoral researcher, associate professor) and moving between institutions; visiting scholarships, honorary chairs and other similar positions are ignored. Associating CV dates with publication dates requires an adjustment for the lag between submitting a paper—the final stage of doing the actual research—and its publication in a journal or conference proceeding. It takes time to set up a project, acquire funding, hire researchers, do the work, write papers, and get them published. For each analysis we have tested the effects of different lags. In this paper, we present the results for a 2-year publication lag.
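A minimal sketch of how a 2-year publication lag could be applied when matching publications to career phases; the phase-boundary parameters are illustrative and would come from the CV data.

```python
def career_phase(pub_year, phd_end_year, postdoc_end_year, lag=2):
    """Assign a publication to the PhD, postdoctoral or professorial phase,
    shifting the publication year back by the assumed submission-to-publication
    lag so that papers are matched to the phase in which the research was done."""
    research_year = pub_year - lag
    if research_year <= phd_end_year:
        return "PhD"
    if research_year <= postdoc_end_year:
        return "postdoctoral"
    return "professorial"
```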

The challenge is to show how one individual scientist has developed his portfolio over time, possibly working in several problem areas simultaneously, some of which are similar and others dissimilar. Mapping the lifetime corpus of an individual scientist in two-dimensional space produces visualisations that tend to look like those in Fig. 1. Each node is a publication. The size of a node indicates the annual average number of citations received from the date of publication until the moment of downloading, thus normalising for the fact that older publications have had more time to accumulate citations. The colours of the nodes represent the different problem areas in the scientist’s corpus. Each edge represents a similarity and has a weight equal to the degree of similarity (the Tanimoto coefficient). The nodes are positioned using the Force Atlas layout, a variant of the Fruchterman–Reingold force-directed algorithm embedded in Gephi (Bastian et al. 2009). Nodes that are highly similar are clustered close together; nodes with low similarity or no similarity are positioned farther apart.

Fig. 1 Map of Zachary Fisk’s publications in two-dimensional space using a force-directed algorithm. Note: similarity in terms of title word-cited reference combinations; node size indicates the average annual number of citations from date of publication to 2011; colours indicate clusters (colour figure online)
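A rough way to reproduce such a map outside Gephi, reusing the similarity network and communities from the sketches above; networkx’s spring_layout (Fruchterman–Reingold) stands in for Force Atlas, and the node-size scaling factor and colour map are arbitrary choices.

```python
import networkx as nx
import matplotlib.pyplot as plt

def draw_portfolio(G, communities, annual_cites):
    """Force-directed map of one scientist's corpus: nodes are publications,
    node size reflects average annual citations, colour the detected community."""
    pos = nx.spring_layout(G, weight="weight", seed=42)
    community_of = {n: k for k, c in enumerate(communities) for n in c}
    nx.draw_networkx(
        G, pos,
        node_size=[50 * annual_cites.get(n, 0.5) for n in G.nodes()],
        node_color=[community_of.get(n, -1) for n in G.nodes()],
        cmap=plt.cm.tab20, with_labels=False, edge_color="lightgray",
    )
    plt.axis("off")
    plt.show()
```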

Figure 2 presents a novel method for mapping a corpus of publications. The information in this figure is identical to Fig. 1. What is new is that we have arranged the publications along two axes: time on the x axis and problem areas on the y axis. Longitude and latitude are defined as

$$ \text{longitude} = \frac{\text{publication year} - \text{year of first documented publication}}{\text{range in years}} \times 360 - 180 $$

$$ \text{latitude} = \frac{\text{community number}}{\text{total number of communities}} \times 180 - 90 $$

The nodes were positioned with the GeoLayout in Gephi, using an equirectangular projection. Since this positions every publication in one problem area and one year on exactly the same location, we use the Noverlap function to force the nodes to be shown side by side. This method allows us to view over time the emergence and development of activity in different problem areas, showing potential overlaps and interrelations.

Fig. 2 Clusters in the corpus of Zachary Fisk mapped over time and linked to CV data
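In code, the coordinate assignment amounts to two linear rescalings; a minimal sketch (the guard against a zero-year range is our own addition):

```python
def layout_coordinates(pub_year, first_year, last_year, community, n_communities):
    """Place a publication in the time (longitude) vs. problem-area (latitude) plane.
    An equirectangular projection then maps these onto a flat plot; overlapping
    nodes still have to be spread out, e.g. with Gephi's Noverlap function."""
    year_range = max(last_year - first_year, 1)
    longitude = (pub_year - first_year) / year_range * 360 - 180
    latitude = community / n_communities * 180 - 90
    return longitude, latitude

# Example: a 1995 paper in community 3 of 11, in a corpus spanning 1980-2010.
print(layout_coordinates(1995, 1980, 2010, 3, 11))  # -> (0.0, approx. -40.9)
```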

Three caveats must be made.

  1. Self-citation. Self-citations may lend artificial coherence to the lifetime corpus of one individual. In his first publication, a researcher cannot refer to his own publications. Later in his career he may use self-citation to raise the impact—and hence reputation—of his work (van Raan 2008b). Similarly, when he enters a new niche of limited size—either self-defined or following a few prior publications—there are few other publications he can refer to. This is precisely where we find the difference between mapping problem areas in a wider community and mapping problem choice by an individual scientist who strategically positions himself in his global and local environments. It is the latter that we want to map, and for this reason self-citations are included. An interesting approach, similar to ours, is presented by Hellsten et al. (2007), who use self-citations to trace how a scientist moves between research topics. The difference is that a method based on self-citations seems less accurate for early-career researchers.

  2. Aggregation. Our method maps problem choice by individuals. This represents the lowest level in Whitley’s (1974) scheme of aggregation, in which a field consists of specialties that consist of research areas that contain problem situations. The research trails that we identify represent the individual scientist’s selection of problem areas that relate to research areas at higher levels of aggregation. Klavans and Boyack (2011) find that global maps of research fields are more accurate than local maps, which seems to argue in favour of mapping individual portfolios in their global context. However, our method provides a highly fine-grained and individual perspective of the way in which a scientist develops a portfolio. It also works on very small data sets.

  3. Who decides and how. Our method presumes that problem choice is a decision of the individual scientist. It consequently disregards group publication strategies, hyperauthorship (e.g. in particle physics), and the rise of team science. Also, there will be disciplinary or community differences in publication and citation cultures (van Raan 2008a; Wouters 1999; Zuckerman 1987), and a scholar’s role in problem selection is likely to change over time.

Data

A sample of individual scientists was constructed to develop the methods and extract statistical measurements. We focus on a single specialty, namely condensed matter physics, and start our search with distinguished scholars who work or have worked at the high-magnetic field labs in Tallahassee, Nijmegen, and Dresden that serve as focal points in the field. In addition, we extracted the top-25 American, Dutch, and German authors from ten important journals in condensed matter physics (Physical Review B, Journal of Physics-Condensed Matter, Thin Solid Films, Physica B-Condensed Matter, Advanced Materials, Applied Surface Science, Surface Science, Journal of Magnetism and Magnetic Materials, Journal of Nanoscience and Nanotechnology, and Physica Status Solidi B-Basic Solid State Physics). From the list of potential candidates, we first selected those with a long and distinguished career and good CV data. We also included some physicists who have had a shorter career or have achieved a less exalted status in their field than others. The result is a sample of 43 condensed matter physicists.

The lifetime publications of the 43 physicists were downloaded from Thomson Reuters Web of Science. The publications retrieved from the Web of Science were manually checked to ensure that they belong to the work of the selected physicists. This check involved a comparison with the scientist’s curriculum vitae, an analysis of the subject areas of the retrieved papers, and a comparison with lists of publications on personal websites and in CVs. The corpus is not necessarily complete. The Web of Science does not include every single academic publication. For example, some of the early publications of Russian physicists—for their PhD and the Doctor of Science theses—have only been published in Russian and may not be included in the dataset. Also, the Web of Science has expanded over time, so coverage today is better than it used to be at the beginning of the careers of our subjects. However, overall coverage is sufficient for the purposes of our analysis.

Results

This section presents the results of the empirical tests of the six propositions.

Proposition 1

A scientist’s work consists of multiple finite research trails

Together, the 43 physicists produced 18,235 publications and worked on 459 problem areas during their careers. On average, in the course of their academic career the physicists in our sample have been (or still are) active in 11 different problem areas (a minimum of 4 and a maximum of 32), producing about 43 publications per problem area and 1.3 publications per area per year. A closer look at the distribution within our sample (Fig. 3) shows that the career output of most researchers clusters into 5–15 problem areas with between 0.5 and 2 papers per problem area per year.

Fig. 3 Number of research trails and the average intensity of activity per research trail

On average, they remain active in a problem area for just over 13 years or 39 % of their career length. Of the 459 individual problem areas, about 41 % lasted five years or less and about 26 % cover all or most of a career. In other words, research trails are finite and scientists will work in multiple trails in the course of their career. Therefore, proposition 1 is confirmed.

Proposition 2

A scientist will work in several parallel research trails

Figure 2 shows how an individual scientist tackles different problem areas in a specific order. After obtaining his PhD, he expanded to other areas and, significantly, began to work in several problem areas at the same time. His work converged on a core of highly similar research trails in which he has worked for many years and which constitutes the basis of his current reputation. Considering the total number of problem areas which the physicists in our sample have worked on during their entire careers, it is quite likely that they worked on two or more problem areas at the same time, much like the example in Fig. 2.

Figure 4 shows the portfolios of four condensed matter physicists from our sample. The portfolios seem very different. They consist of different mixtures of short and disconnected research trails intermingled with very long and active trails. Within trails we can see bursts of activity as well as periods of intermittent activity. Some portfolios show more focus while others are more diverse. The underlying statistical patterns and sociological drivers may, however, be the same.

Fig. 4 Visualisation of the lifetime corpus of four condensed matter physicists with time on the x axis and problem areas on the y axis

The number of problem areas in which a scientist is active at any one time can be considered a measure for the diversity of his portfolio. How diverse is the portfolio of our condensed matter physicists in different phases of their academic lifecycle? We distinguish between three phases: the PhD phase, the postdoctoral phase, and the professorial phase. The phases were demarcated using CV data.

Figure 5 presents the average number of parallel research trails in the three phases of the academic lifecycle, using years with at least ten observations. The beginnings of the postdoctoral and professorial phases were placed in the year when the average scientist in the sample entered that particular phase. There is an overlap between phases as not all scientists began or ended a particular phase in the same year of their career. The figure shows that PhDs, as expected, tend to work in one problem area only. This is followed by an expansion of the portfolio during the postdoctoral period, which appears to be the period in which our scientists search for a core specialisation. As professors, the number of parallel problem areas is steady at around four. On this basis, proposition 2 is confirmed.

Fig. 5 Average number of parallel research trails for the PhD, postdoctoral and professorial phases of the academic lifecycle (assuming a publication lag of two years)
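One way the numbers behind Fig. 5 could be computed; the data structures (research trails as sets of lag-corrected activity years, phases as sets of career years) are our assumptions, not the paper’s implementation.

```python
from collections import defaultdict

def parallel_trails_per_year(trails, first_year):
    """Number of problem areas a scientist is active in, per career year.
    `trails` maps a problem-area id to the set of years with at least one
    publication in that area (already shifted back by the two-year lag)."""
    counts = defaultdict(int)
    for years in trails.values():
        for y in years:
            counts[y - first_year] += 1
    return counts

def phase_average(per_scientist_counts, phase_years, min_obs=10):
    """Average number of parallel trails over all scientists for the career years
    in one phase, keeping only years with at least `min_obs` observations."""
    by_year = defaultdict(list)
    for counts in per_scientist_counts:
        for year, n in counts.items():
            by_year[year].append(n)
    return {y: sum(v) / len(v)
            for y, v in by_year.items()
            if y in phase_years and len(v) >= min_obs}
```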

Proposition 3

A scientist’s role in research trail selection changes along the lifecycle

Figure 6 shows the percentage share of each author position in the annual output of the sample of scientists since the first documented publication. The figure shows that in the first few years of their career the sample physicists were first or only author on approximately two-thirds of their publications. During the first 10–15 years this percentage fell sharply until, by the 15th year, only about 20 % of papers were written as first or only author. This decline was compensated for at first by a simultaneous increase in the share of last and other author positions and, later in their career, by a further increase in the percentage share of last author positions. After about 30 years of publishing, the sample physicists were last author on 50–60 % of their publications. Figure 7 shows how one scientist progressed in the author list throughout his career.

Fig. 6 Percentage share of different author positions from first documented publication

Fig. 7 Visual representation of author positions over time in different problem areas for T. Rasing. Note: gray = first author; black = last author; white = other author positions

Table 1 presents the percentage share of each author position in total output per phase. The results confirm that the majority of papers (56.4 %) in the PhD phase were written as first or single author. In the postdoctoral phase, the share of papers written in other or last author positions increased rapidly. The increase in other author positions may suggest that in addition to an increase in output and a diversification of problem areas, scientists expand their coauthor networks during their postdoctoral phase. This is evident from the development of the average number of coauthors per paper in the three career phases (Fig. 8). Perhaps this is where they become active in interinstitutional collaborations and decisions on problem choice are taken collectively rather than individually. In the professorial phase, we observe a relative decrease in other author positions and a strong increase in last author positions. The shift in author positions provides a good indication of the changing role of scientists in problem choice in the course of their career. Hence, proposition 3 is confirmed.

Table 1 Percentage of publications written in first, other or last author positions in three phases of the academic lifecycle

Fig. 8 Average number of co-authors in the PhD, postdoctoral and professorial phases of the academic lifecycle (assuming a publication lag of two years)
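A minimal sketch of the author-position bookkeeping behind Fig. 6 and Table 1; author name matching and disambiguation, which the paper handled manually against CVs, is glossed over here.

```python
def author_position(author, author_list):
    """Classify a scientist's position on a paper as 'first', 'last' or 'other';
    single-authored papers count as 'first', grouping first and only authorship."""
    if author_list[0] == author:
        return "first"
    if author_list[-1] == author:
        return "last"
    return "other"

def position_shares(papers, author):
    """Percentage share of first/other/last author positions over a list of papers,
    where each paper is represented by its ordered author list."""
    counts = {"first": 0, "other": 0, "last": 0}
    for author_list in papers:
        counts[author_position(author, author_list)] += 1
    total = sum(counts.values()) or 1
    return {k: 100 * v / total for k, v in counts.items()}
```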

Proposition 4

The start and end of research trails is associated with career changes

Is there an association between key events in a scientist’s career and the rise and fall of research trails? We have compared the timing and duration of career phases (PhD, postdoctoral, professorial) and career moments (key events such as promotion or moving to another institute or country) with the timing of the start and end of research trails to find out if such an association exists.

First, we have examined in which phases of an academic career research trails start and end. Table 2 shows that the postdoctoral phase is the most active in terms of the net increase in research trails. As PhD students, scientists begin with one trail, and by the time they graduate most will have explored one additional problem area. Postdoctoral researchers start working in 4 or 5 different problem areas and only end an average of 1.7 research trails. This matches our earlier analysis of parallel trails and the expansion of a scientist’s portfolio in the postdoctoral period. We also find a match in the professorial phase, when the scientists in our sample started about as many trails as they abandoned. The data in Table 2 show that there is an association between career dynamics and problem choice. From this, the question arises whether it is the career phase or the career moment that matters.

Table 2 Number of research trails that started or ended in different phases

Table 3 shows that the scientists in our sample remain in one position or location longer as they progress up the academic hierarchy. The average postdoctoral career moment lasts 3.9 years; the final unfinished professorial moment 12.2 years. New research trails are started on average every 2.5 years, which is so close in length to a postdoctoral career moment that it is hard to tell if the association is between phases or moments.

Table 3 Time between career moments compared to the time between the start of new research trails (years)

We have examined the percentage of research trails that were started in the first 4 years after a career moment in the postdoctoral and professorial phases. By the fourth year, 62 % of all research trails initiated during the postdoctoral phase had started, compared to 51 % of all trails initiated during the professorial phase. For abandoned trails the percentages are 58 and 41 % respectively (assuming a two-year publication lag). When we normalise these percentages for the average length of a career moment, the results speak in favour of professors. The average professorial career moment is 1.6 times as long as the average postdoctoral moment (6.2 vs. 3.9 years), while the probability of trails starting or ending in the 4 years after a career moment is only 1.2 and 1.4 times as high for postdoctoral researchers. In other words, professors have more time to start or abandon trails, but they are more likely to do so in the first 4 years after a career moment than postdoctoral researchers.

To complicate matters further, there may in some cases be an association between career changes and the continuation of research trails started in an earlier phase. Work that earned someone a postdoctoral position or professorship may well be continued after an appointment. The association between career moments and problem choice may not relate to the start or end of a research trail.

In summary, the data suggest that there is an association between career phases and portfolio development. However, we cannot prove that there is an association between career moments and the start or end of research trails. From this, proposition 4 is neither confirmed nor rejected.

Proposition 5

The start and end of research trails is associated with the potential for reputational gain

Citations received can be considered a good proxy for recognition, that is, the contribution of a publication to a scientist’s reputation. Scientists enter problem areas because they expect to achieve marginal reputational gains; they abandon them when the expectation is not met or the potential gains in terms of reputation and recognition are depleted. Following this, are the ends of trails associated with citations received as the main indicator of reputation? If so, then by association the starts of trails are also associated with citations received.

We distinguish three types of research trails: (1) short trails, defined as trails that (a) exist for no more than 5 contiguous years or (b) represent no more than a total of 5 years of activity; (2) lifetime trails with activity throughout most of a career; and (3) intermediate trails that encompass more than 5 years of (contiguous) activity but less than an entire career. Among the research trails of the 43 condensed matter physicists there are:

  • 189 short trails that represent no more than 5 total or contiguous years of activity, lasting an average of 3.7 years, of which 152 are short trails that exist for no more than 5 years and last an average of 2.2 years;

  • 121 lifetime trails that last an average of 26.9 years;

  • 149 intermediate trails that last an average of 14.6 years.

In other words, of all the research trails in our sample, 41.2 % are short, 26.4 % lifetime, and 32.5 % of intermediate duration.
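A sketch of how trails could be sorted into these three classes. The paper only says that lifetime trails show activity throughout most of a career, so the 80 % threshold below is our assumption.

```python
def classify_trail(active_years, career_length, lifetime_share=0.8):
    """Classify a research trail as 'short', 'lifetime' or 'intermediate'.
    `active_years` is the sorted list of years with activity in the trail and
    `career_length` the scientist's career length in years."""
    total = len(active_years)
    span = active_years[-1] - active_years[0] + 1  # contiguous duration
    if total <= 5 or span <= 5:
        return "short"
    if span >= lifetime_share * career_length:
        return "lifetime"
    return "intermediate"
```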

The development of average annual citations per publication clearly shows the difference between short research trails and longer trails (Fig. 9). Short trails start from around 2.5 citations per year and drop below 1 in the fourth year. Lifetime trails start from around 3.5 citations per year and gradually decline to just under 2. Initial inspection of intermediate trails shows average annual citations starting at just under ten on average. This turned out to be due to eight very specific trails of seven scientists, including some of the most highly cited papers in this field (e.g. Geim’s work on graphene, which has received thousands of citations since 2005). After adjusting for these outliers, the development of average annual citations for trails of intermediate length was almost identical to that of the lifetime trails.

Fig. 9 Comparison of average annual citations per publication in short, lifetime and intermediate research trails

Short trails apparently fail to deliver the expected marginal reputational gains, as evidenced by the short, sharp drops in average annual citations, and were consequently abandoned. The reasons for these drops are not evident from the data, but Bornmann and Daniel (2010) provide additional support for the hypothesis that the speed with which citations are received is an indication of reputational gains. Articles accepted for publication in a highly reputable chemistry journal received their first citations faster than rejected articles that were published elsewhere (see also van Dalen and Henkens 2005). Similarly, Drucker and Goldstein (2007) show that early citations are a good predictor of lifetime citations. Conversely, reputational gains cannot explain why intermediate trails are abandoned and lifetime trails are not. We can, however, speculate that this may be due to developmental hurdles or oversaturation of the field. To test for such phenomena, future analyses will have to compare the individual’s activity with worldwide production in a problem area to ascertain whether the decrease in activity in the area is universal amongst all other researchers in the field or specific to the researcher in question. Without more detailed evidence we can only state that, for our purposes, proposition 5 is confirmed for short research trails.

Proposition 6

A scientist’s portfolio will converge before it diverges

Bonaccorsi (2008) examines the dynamic properties of the scientific search process at the meso level of scientific fields. He characterises a search regime based on three properties: the rate of growth, the degree of diversity (search regimes can be convergent or divergent) and complementarities between the required resources. It has proven very difficult to establish empirically whether or not fields are convergent or divergent. We can also use Bonaccorsi’s model to characterise the search regime of individual scientists. By mapping the lifetime development of their portfolio, we can empirically test for convergence and divergence at the individual level.

Figure 10 shows the degree of similarity between papers published in year t and each scientist’s own papers published in preceding years. We have tested for the two preceding years (t-1 and t-2) and for the 3 years before that (t-3, t-4 and t-5). The figure shows that similarity declines almost continuously in the first 20 years of publishing. Nearer the end of the average career, in the professorial phase, we find a mild increase in similarity.

Fig. 10 Similarity between an author’s papers published in year t and his papers published 1 or 2 years or 3, 4 or 5 years previously (Tanimoto coefficients; three-yearly moving averages). Note: one scientist’s data were removed from the sample due to extreme outliers
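A sketch of this measure; the paper does not specify how pairwise similarities are aggregated, so the simple mean over all publication pairs below is an assumption.

```python
def similarity_to_recent_past(pubs_by_year, year, lags=(1, 2)):
    """Mean Tanimoto similarity between the papers a scientist published in `year`
    and his own papers published `lags` years earlier (e.g. (1, 2) or (3, 4, 5)).
    `pubs_by_year` maps a publication year to a list of publication dicts."""
    current = pubs_by_year.get(year, [])
    past = [p for lag in lags for p in pubs_by_year.get(year - lag, [])]
    if not current or not past:
        return None
    sims = [tanimoto(a, b) for a in current for b in past]
    return sum(sims) / len(sims)
```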

The decline in similarity after the PhD phase is probably related to the expansion of portfolios observed in the postdoctoral phase. It is in the nature of the method used to identify problem areas—specifically the fact that Blondel et al.’s algorithm optimises for modularity—that similarity is high within areas and low between areas. An increase in the number of parallel problem areas exerts downward pressure on this particular measure of convergence. This increase is itself additional evidence of divergence (diversification) rather than convergence. Therefore, proposition 6 is rejected.

Analysis

We have empirically tested six propositions. Four have been confirmed, one has been rejected, and one could not be confirmed or rejected. In combination, the results of the four confirmed propositions reveal search strategies along the academic lifecycle. We know that, in the course of their academic career, scientists work in multiple finite research trails and that they often work in several problem areas at the same time. Thus, we provide scientometric and statistical proof for the conclusions of Zuckerman, Cole and Hagstrom. We also found that a scientist’s role in selecting problem areas changes over time and that entry into and exit from a problem area is associated with the potential for reputational gain. We found no evidence of convergence followed by divergence—quite the opposite—and could not prove or disprove the association between problem choice and career changes.

As PhDs, scientists work in one problem area, occasionally a second or third by the time they graduate. The first problem area is most likely suggested or provided by a supervisor. PhD research trails are often unconnected or relatively dissimilar to later research trails. In this phase, scientists work mainly as first author, which means that they do most of the work, even when they have not actually selected the problem area.

As postdoctoral researchers, scientists autonomously search for a personal niche that can develop into a coherent corpus. They expand their portfolio from one or two problem areas to as many as four or five and they extend their network of collaborators, publishing mostly in other or last author position. For postdoctoral researchers, diversification in problem areas and collaborators is a means to acquire a reputation. Expected and actual reputational gains explain entry into and rapid exit from problem areas. This suggests that the postdoctoral period is where scientists lay the foundations of their reputation and future career by constructing the core of their corpus, developing access to a community of peers, and building a social network of potential collaborators. The observed trends in average annual citations per publication suggest that most marginal gains are generated in the early years of work in a problem area. Parker et al. (2010) suggest the same thing: “highly cited research tends to be published in earlier career stages.” The postdoctoral period, where more than 40 % of a scientist’s research trails start, is where his reputation is cemented.

As professors, scientists work in a stable number of parallel trails. This may reflect that, as group leaders, they are now part of a group publishing process in which they provide the grand design, while their PhD students carry out the work and their postdoctoral researchers search for new directions. Their work is organised more according to project funding cycles, and group size places a limit on their managerial span of control. The notion that the behaviour of scientists changes as they age is not new. Gingras et al. (2008) showed that as scientists age, they move towards last author positions and emphasise the collaborative aspects of research. Verbree et al. (2012a, b) find significant differences in the leadership and management activities of starting, experienced and nearly retiring research group leaders.

Conclusions and discussion

In this paper we have developed a novel scientometric method to longitudinally map and statistically analyse the lifetime scientific output of individual scientists. Using this method, we have identified the search strategies of 43 condensed matter physicists along the academic lifecycle. In a nutshell, the results show that PhDs, postdoctoral researchers, and professors have their own specific search strategies. The search strategies of scientists along their academic lifecycle are all about strategic positioning. This involves selecting the right problem areas, developing a coherent portfolio or mix of problem areas, gaining access to communities of peers that can provide recognition, and developing from individual researcher into group leader. At no time is this more visible than during the postdoctoral phase, when portfolios diversify and take shape, output increases, and researchers develop into leaders.

This study is not without limitations. We have examined a small sample from one scientific field, sufficient to develop scientometric and statistical methods with which to test a theoretical framework. Also, the selected scientists mostly (though not exclusively) have had longer careers and are accomplished, sometimes eminent, researchers. Our method needs more extensive testing. Specifically, our research agenda has four points. First, we plan to compare the search strategies of scientists in different fields and disciplines. This is where we scale up our method, comparing larger samples across different cognitive and institutional contexts and also testing its application to groups and institutions. Second, we need to focus on younger scientists, including some who dropped out of academia. Third, we want to examine the worldwide size and crowdedness of the problem areas that scientists select. Finally, the scientometric method must be combined with ethnographic and sociological methods to learn more about the behavioural dynamics of individual problem choice. Scientometric mapping can serve as an instrument during interviews to better understand the long-term strategies and short-term decisions of scientists in different stages of the academic lifecycle.

We believe that our method opens up new opportunities for research into the dynamics of agenda setting in science. It can provide new insights into the long-term dynamics and organisation of science and into the effects of policy and funding programmes. The results can provide university and S&T policy makers with a better understanding of the nature and extent of their leverage at the micro level. Also, the characteristics of search strategies can be expressed statistically as a set of evaluation indicators that takes the lifecycle of the subject into account.

Moreover, the results of this paper provide further support for the notion that science is a complex adaptive system. We have shown how individual rules and preferences play out in interaction with a community of peers and within the institutional organisation of science to produce complex individual portfolios. The development of institutional and national scientific portfolios can be seen as emergent from the problem choice of individual scientists who are constantly looking for new niches and communities where they can build their reputation, while progressing from PhD to research group leader along the academic lifecycle. By extension, communities and scientific specialties are also emergent. Community size is not constant, as has been suggested by Price and Wray (2010), but fluid. In time, the results of studies like this one should allow us to simulate the dynamics of science at different levels to better understand the effects and leverage of science policy.