Are evaluative cultures national or global? A cross-national study on evaluative cultures in academic recruitment processes in Europe

Studies on academic recruitment processes have demonstrated that universities evaluate candidates for research positions using multiple criteria. However, most studies on preferences regarding evaluative criteria in recruitment processes focus on a single country, while cross-country studies are rare. Additionally, though studies have documented how fields evaluate candidates differently, those differences have not been deeply explored, thus creating a need for further inquiry. This paper aims to address this gap and investigates whether academics in two fields across five European countries prefer the same criteria to evaluate candidates for academic positions. The analysis is based on recent survey data drawn from academics in economics and physics in Denmark, the Netherlands, Norway, Sweden, and the UK. Our results show that the academic fields have different evaluative cultures and that researchers from different fields prefer specific criteria when assessing candidates. We also found that these field-specific preferences were to some extent mediated through national frameworks such as funding systems.


Introduction
Academia has always been an international endeavor as disciplines transcend national borders, and scholars collaborate internationally. However, this trend has increased in recent years, and academia has become even more globalized, with flows of international students, researchers, and an international academic job market in which universities compete for the best researchers. Not only are universities actively recruiting foreign faculty to build their international reputation, but individual researchers are also using their international networks to recruit highly qualified postdocs and PhDs to fill their research and teaching needs outside their country (Ortiga et al. 2020). In Europe, the Bologna Process [...] isomorphism, which suggests that the internationalization of universities and the evaluation processes preferred in recruitment are embedded in academic fields and lead to national similarities in evaluative criteria preferences, and (ii) path dependency, which suggests that higher education institutions are embedded in a national context that generates a country-specific culture regarding researchers' preferences for different evaluative criteria.
To examine these elements, we applied new and original survey data from 2017/2018, including responses from economics and physics academics in universities in five different European countries (Norway, Sweden, Denmark, the Netherlands, and the UK). In the following sections, we first review prior studies on academic recruitment and then formulate expectations based on arguments about the role of national contexts and isomorphism. We then present data and methods before turning to the analysis. We summarize our results in the "Discussion" and "Conclusion" sections and offer avenues for further research.

Evaluative criteria in recruitment
Recruitments are fundamental organizational processes as well as gatekeepers that control and determine organizational membership and social boundaries. These processes are salient in organizations that must recruit the right kind of people for both technical and symbolic reasons (Scott and Davis 2007). In these processes, the application of evaluative criteria is pivotal, since they represent core peer-review processes (Langfeldt and Kyvik 2011) that are controlled and conducted by the academic profession (Musselin 2010). In these evaluations, candidates are assessed on multiple criteria, such as teaching experience (Levander et al. 2019), international experience and language skills (Herschberg et al. 2018), administrative skills (Hamann 2019), or social skills (Musselin 2010); therefore, the desired candidates are often referred to as "the sheep with five legs" or "jack-of-all-trades" (Van den Brink and Benschop 2011). Nevertheless, research output is often the most salient criterion (Van den Brink and Benschop 2011), although teaching experience has recently been gaining importance (Levander et al. 2019).
However, research quality and academic qualifications are not fixed entities but socially constructed and negotiated among academics in peer-review processes (Langfeldt et al. 2019). Each field has its own evaluative culture with its own understanding of academic qualifications and research quality that is tightly linked to its identity, epistemology, and academic work (Becher and Trowler 1989; Lamont 2009). In these different evaluative cultures, qualities are valued differently. For instance, humanities scholars have been found to define interpretative skills as highly important, while this quality has a more negative connotation in the social sciences (Lamont 2009).
Different evaluative cultures have also been identified in the evaluation of candidates in recruitment processes. Researchers in economics, for example, place more emphasis than researchers in biomedicine on how many publications a candidate has secured in top journals (Hammarfelt and Rushforth 2017). Furthermore, international experience is more highly valued in the natural sciences than in the social sciences (Herschberg et al. 2018), and teaching experience generally seems to be more strongly emphasized in science, technology, engineering, and mathematics (STEM) fields than in the humanities or social sciences (Levander et al. 2019). Yet, although prior studies have addressed these field-based characteristics of evaluative criteria, the empirical evidence is still scarce.
The use of evaluative criteria is regulated not only by academic fields but also by national academic career structures with different types of positions and diverse obligations, career paths, and recruitment procedures (Alfonso 2016; Sanz-Menéndez and Cruz-Castro 2019). Musselin (2010) found that the national context, including different formal and informal procedures, to some extent shapes the use of evaluative criteria. The American custom of inviting candidates to a visit, including lunch and dinner, provides greater opportunity for evaluating their personalities, and contrasts with the European approach of recruiters and candidates having more limited social encounters through more formal interviews. Similarly, the German custom of hiring candidates with long careers increases recruiters' expectations of scholarly output compared with the French or American traditions of hiring younger applicants (Musselin 2010). At the same time, the national differences in evaluative cultures are somewhat less sensitive to external logics. This paper does not aim to provide a complete description of national systems and regulations, but their diversity is an important premise that shapes national contexts in this study and, thus, potentially influences the way evaluative criteria are applied. We will return to this in the "Methods" section, in which we elaborate upon the selection of countries and fields for our study.
Hitherto, we have discussed how different evaluative criteria play decisive roles in academic recruitment. However, recruitment processes are complex undertakings in which peer evaluation and evaluative criteria are only two of many factors influencing the final selection. Studies have shown that recruitment is not always a meritocratic process (Nielsen 2016) in which candidates are selected based on a set of fixed evaluative criteria (Musselin 2010). Academic inbreeding is common in many countries (Altbach et al. 2015; Tavares et al. 2019), and social networks enhance both academic careers (Pezzoni et al. 2012; Rossier 2020) and candidates' chances of success in recruitment (Combes et al. 2008; Lutter and Schröder 2016). The literature has also shown that recruitment processes can be gender biased (Husu 2000; Nielsen 2016; Wennerås and Wold 1997). Finally, recruitment processes are social processes in which the ranking of candidates may be strategic (Musselin 2010), and evaluators need to legitimize their conclusions (Hamann 2019). In particular, the final selection of the highest-ranked candidates is often hard to explain in terms of evaluative criteria (Musselin 2010). Although these criteria may not explain why a candidate was ranked highest, they reflect how evaluation committees select a shortlist of candidates, as they offer a basis for peers' discussions of candidates and the criteria used in arguing for them, even in cases in which this is window dressing (Musselin 2010). Moreover, the emphasis on evaluative criteria sends strong signals to the research community regarding which qualifications are more important than others (Tagliaventi et al. 2020), and it has thus been shown to be important in the selection of candidates (Herschberg et al. 2018; Van den Brink and Benschop 2011). Hence, we focus on evaluative criteria as more openly expressed factors affecting candidate selection.
At the same time, we acknowledge that there are other more subtle or informal factors that also influence the selection of candidates; these must be studied with other methods and are therefore not part of our analysis.

National context
Despite common organizational features, universities are created and embedded in highly diverse national traditions and governance arrangements that have generated persistent differences between universities in different countries (Clark 1983). These systems, with their rules, norms, and traditions, consistently shape universities (Whitley 2003). The national contexts underline the resilience of institutions in universities, and Colyvas and Powell (2006), for example, showed that new procedures at universities must pass through several phases before gaining sufficient legitimacy to overcome initial resistance.
In a similar argument, the historical institutionalist literature stresses the importance of historical developments for today's decisions and future organizational paths (Mahoney and Thelen 2009; Thelen 1999). This perspective highlights the fact that temporality and context matter for decision-making costs and assessments of alternatives, and it indicates that institutional structures lead to path dependency, feedback mechanisms (Pierson 1993, 2004), and lock-in effects (Sydow et al. 2009). Thus, higher education research often argues that change represents one of the primary challenges for universities, and when change materializes, it proceeds incrementally and mostly through organizational layering (Clark 1983). For our study, this understanding implies that even if there is more internationalization in academic labor markets and recruitment as well as an increasingly global disciplinary community, specific national norms, values, regulations, and structures should still matter in the way evaluative criteria are mobilized by academics.
In this perspective, national legal frameworks and funding arrangements create long-lasting differences between universities in different countries and between national academic labor markets, which in turn can be expected to influence how academics embedded in these environments approach evaluative criteria. For example, in many European countries academics in public universities are at least partly regulated by laws that govern public sector employment. Similarly, Aagaard (2015) has demonstrated that national performance-based evaluation systems affect researchers' assessments of peers. Thus, even though academic labor markets have become more international in recent years, it can be assumed that national contexts still matter for recruitment processes. Our first expectation is therefore that researchers in different countries prefer distinctive evaluative criteria due to the specific national, historical, and cultural environments in which universities operate.
Expectation 1: Researchers from similar fields in different countries have distinct preferences regarding evaluative criteria in recruitment processes due to the specific national context in which they operate.

Internationalization
Despite universities' national embeddedness, academia has always been an international endeavor. In the last decades, researchers in organizational studies have highlighted how global reforms have spurred universities to become more alike and linked this development to a process of bureaucratization in which organizations in the same field converge as the field matures (Bromley and Meyer 2015; DiMaggio and Powell 1983; Meyer and Rowan 1977; Ramirez 2006). DiMaggio and Powell (1983) describe this as a process of homogenization or isomorphism in which organizations are drawn toward compatibility with other organizations in the same field. In this understanding, isomorphism is linked to shared values, organizational structures and common beliefs that spread over time throughout the organizational field. The organizational field of universities has a long history of well-established and shared beliefs and rules (Meyer et al. 2007) that define what is perceived as appropriate and what signifies prestige and standing in the academic community. At the same time, especially in recent decades one can observe an increasing prominence of internationalization and a shift in the dominant discourse on higher education (Buckner 2017).

These newly highlighted shared values and norms connect universities across national boundaries and create isomorphic pressures that drive organizations to adapt in order to gain and retain legitimacy (DiMaggio and Powell 1983). Adaptation may occur through reforms that alter actors' authority (coercive isomorphism) or through mimetic isomorphism without structural change (Marini 2020). One example of the latter is that university leaders are increasingly preoccupied with recognition through league tables, rankings, and other international evaluative instruments (Paradeise and Thoenig 2015; Sauder and Espeland 2009). Given the global rationalization of universities (Ramirez 2010) and how recognition and prestige are related to a global organizational field in higher education (Drori et al. 2003; Krücken and Meier 2006), growing internationalization can be expected to give rise to converging preferences for the use of evaluative criteria, thus leading to a more integrated global academic labor market.
While evaluation processes in academic recruitment are conducted in the contexts of universities, they are also embedded in their disciplinary fields (Clark 1978; Lamont 2009) and controlled by peers (Musselin 2010). The fields have their own evaluative cultures closely tied to their epistemological traditions and academic work (Becher and Trowler 1989; Lamont 2009; Välimaa 1998). These are also found in evaluation processes in recruitment in which the fields employ specific evaluative criteria when assessing candidates (Herschberg et al. 2018). Moreover, academics strive primarily for peer recognition and prestige within their fields (Clark 1983; Drori et al. 2003; Hessels et al. 2019). While disciplinary fields have always had an international orientation (Lamont 2009; Langfeldt et al. 2019), they have in the last decades experienced increased internationalization through expanded participation in international conferences, journals, and academic training arrangements (Whitley et al. 2010). Hence, because evaluation processes are embedded in the values and norms of internationally oriented fields, one could expect convergence and isomorphism of preferences regarding evaluative criteria within each field independent of the national context (Buckner 2019; Ramirez 2006).
Expectation 2: Due to increased internationalization and the related isomorphism of disciplinary fields, researchers in the same fields in different countries prefer similar evaluative criteria in recruitment processes.

Empirical context: fields and countries
We selected physics and economics as fields because we expected them to have different criteria and standards for evaluating candidates. As noted by others, economics has a special status in the social sciences; recruiters place considerable value on candidates' publications in top journals, and there is a high level of internal consensus on mainstream or neoclassical economics, which is sustained by a highly international knowledge community (Hylmö 2018; Lee et al. 2013). Furthermore, researchers in economics and social science in general are less functionally dependent on the work of their colleagues (Whitley 2000). In physics, on the other hand, we assume researchers to be dependent on the results and methods of others and research to be a more collaborative effort (Välimaa 1998; Whitley 2000). The two fields also have different publication practices: scientific publications in physics (particularly those building on large experiments) often have a high number of co-authors, while publications in economics have relatively few. The different ways of doing research may further have a bearing on the fields' notions of quality and preferences for the use of evaluative criteria in hiring processes (Välimaa 1998). Therefore, we assume that the two fields apply different evaluative criteria: researchers in economics should place stronger emphasis on metrics, while researchers in physics should, to a larger extent, emphasize candidates' research profiles in order to assess how they might fit into a research group. This difference between the two fields has also been documented by recent qualitative studies building on confidential recruitment reports (Reymert 2020). Finally, both fields are characterized by a high degree of internationalization.
While this increases the likelihood of finding field-specific effects compared with, for example, selecting a more national field, such as history or literature, internationalization is also something of a prerequisite that enables us to distinguish between national effects and field effects as highlighted in expectation 2. At the same time, these fields could be described as least likely to have national differences; therefore, differences that can be found between countries are especially relevant.
Globally, academic recruitment is organized in very different ways. To ensure relatively similar research conditions, we chose to compare North-Western European countries, namely Denmark, the Netherlands, Norway, Sweden, and the UK. These countries have well-developed and well-funded higher education systems, and they have different national contexts and performance-based research funding systems, which we assume to affect the use of evaluative criteria. Sivertsen (2017: 2) identified four ideal types of such systems, and the five countries in our study cover all of them. We assume that the varying emphasis on citations, publications, and education could impact academics' preferences for the use of evaluative criteria in hiring processes in line with expectation 1. One reason for this is that the national funding systems set differing incentives when considering which criteria are the most promising from a funding perspective.

Data and methods
To investigate our research question, we used a Web-based survey, which was distributed to researchers in 2017/2018. Our target population was academic staff that had been involved in recruitment processes. To generate the respondent list, we pursued a two-step strategy in which we combined journal classification (Web of Science (WoS)) and organizational units to delimit the sample. In total, 59% of respondents were identified from staff lists and 41% from WoS data. We removed respondents who declined to participate, those who were outside the target group, and those with nonfunctioning email addresses. The survey achieved an overall response rate of 33.6%, varying from 11.4% for economics in the UK to 57.3% for economics in Norway (see Table 1). The survey was part of a larger international research project, and we have included more information on the survey and its representativity in the Appendix.
Since we aimed to investigate academics' preferences for evaluative criteria in academic recruitment, we singled out respondents who reported participating in recruitment processes (848 of 1697 respondents). Most respondents included in our sub-sample were professors; 80% were male, and more than half were between 40 and 59 years old. Table 2 shows their background information.

Dependent variables
In the survey, we asked the respondents to think about their last assessed candidate and identify which type of position they had assessed for: junior, senior, or other. They were also asked to indicate the importance of 13 predefined evaluative criteria on a scale of "Not important," "Somewhat important," "Highly important," and "Do not remember/cannot answer." In the analyses, we combined "Somewhat important" and "Do not remember/cannot answer" into a neutral category, and for the regression analysis, we computed dummy variables in which "Highly important" was assigned a value of 1 and other answers a value of 0.
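The recoding described above can be illustrated with a minimal Python sketch. This is not the authors' R code: the function and variable names are hypothetical, the label "Neutral" is an assumed name for the merged category, and only the response labels and the two recoding rules are taken from the text.

```python
# Illustrative recoding of survey answers (hypothetical names, not the authors' code).
NEUTRAL = "Neutral"  # assumed label for the merged category

def collapse(answer: str) -> str:
    """Merge 'Somewhat important' and 'Do not remember/cannot answer' into one category."""
    if answer in ("Somewhat important", "Do not remember/cannot answer"):
        return NEUTRAL
    return answer

def to_dummy(answer: str) -> int:
    """Regression dummy: 1 for 'Highly important', 0 for every other answer."""
    return 1 if answer == "Highly important" else 0

# One made-up answer per response option, for illustration only.
answers = [
    "Highly important",
    "Somewhat important",
    "Not important",
    "Do not remember/cannot answer",
]
collapsed = [collapse(a) for a in answers]
dummies = [to_dummy(a) for a in answers]
```

Note that under this coding "Not important" and the neutral category both map to 0 in the regressions, so the models contrast "Highly important" with all other answers.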
Since prior studies have shown that some evaluative criteria are more important than others, we also asked the respondents to select the most important aspect of all the criteria they identified as "Highly Important." The questions in the survey were based on prior research on academic recruitment processes (Herschberg et al. 2018; Levander et al. 2019; Van den Brink and Benschop 2011) and literature on research quality evaluations (Lamont 2009). We also allowed respondents to define their own criteria if they desired, but only a few submitted self-defined criteria, so additional categories were not constructed. As shown in Table 3, we compressed the descriptions from the survey into shorter abbreviations, which we refer to in the "Discussion" section.
One limitation of surveys is that they do not offer detailed answers. In this study, the category Future Potential posed some interpretation challenges, as it may have included issues such as future potential in terms of research contributions, teaching, or being a good colleague. However, since the survey primarily addressed issues around research and conditions, we believe that most respondents associated this category with research-related practices. We thus understand Future Potential as an indirect reference to future research contributions.
We first used bivariate correlation patterns before applying logistic regression analysis with control variables.

Independent variables
In the logistic regression analysis, we controlled for the country in which the respondents worked and their field affiliation. We further controlled for whether the respondents recruited for a junior or senior position and for background variables, such as gender, age, and their own academic positions. We refrained from analyzing institutional differences because we had relatively few respondents from each institution (only one institution had more than 17 respondents in one of the fields).
Table 3 (continued)
Teaching experience/achievements (including supervision of students)
Third mission experience: Experience in interacting with the public/users/industry
Third mission work experience: Experience/achievements from work outside science, e.g., professional/clinical practice, industry or public administration

Methods
We analyzed the data using R. 1 To ensure equal field and country compositions in the bivariate data presentations, we developed and applied weights (see Appendix Table 5).
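One way such weights could be constructed is sketched below in Python. The scheme is an assumption for illustration (each field-by-country cell is given the same total weight); the weights actually used are those reported in Appendix Table 5, and the toy sample and function name are hypothetical.

```python
from collections import Counter

def equal_cell_weights(cells):
    """Return one weight per respondent so that every (field, country) cell
    carries the same total weight. Assumed scheme, not necessarily the one
    behind Appendix Table 5."""
    counts = Counter(cells)
    target = len(cells) / len(counts)  # weighted size each cell should have
    return [target / counts[c] for c in cells]

# Hypothetical toy sample: one (field, country) pair per respondent.
sample = (
    [("physics", "NO")] * 3
    + [("economics", "NO")] * 1
    + [("physics", "UK")] * 2
)
weights = equal_cell_weights(sample)
```

With these weights, respondents in over-represented cells count for less and those in under-represented cells for more, while the weighted sample size stays equal to the number of respondents.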

Binary logistic regression analyses
To analyze country and field effects on which evaluative criteria researchers prefer, we applied binary logistic regression analysis with the different evaluative criteria as dependent variables. Before conducting the analyses, we examined Pearson's r correlations to check that none of the independent variables were highly correlated (see Fig. 1 in the Appendix). We applied ANOVA tests to investigate whether the independent variables contributed significantly to explaining the variance of the dependent variable (Agresti 2013) and used the AIC and BIC to identify the models best suited to explaining this variance (Agresti 2013). The best-suited models are shown in the paper, while the others are available in the Appendix. All binary logistic regression models were estimated with each of the countries in turn as the baseline category to map country effects, but only the models with the Netherlands as the baseline category are shown since the Dutch respondents had the most deviant answers; hence, these models show most of the significant effects that we discovered. To investigate interaction effects between country and field, we conducted separate regression analyses for physics and economics instead of including interaction terms in the regression models because of the problems associated with interaction terms in logistic regression models with relatively few observations (Mood 2009). 2
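The model comparison by AIC and BIC can be sketched in Python as follows. The two formulas are the standard definitions; the model names, log-likelihoods, and parameter counts are invented for illustration and are not results from this study. The sample size of 848 is the number of respondents with recruitment experience reported above, and the number of observations in any given model may differ.

```python
import math

def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical fits: model name -> (log-likelihood, number of parameters).
candidates = {
    "field only": (-560.0, 2),
    "field + country": (-548.0, 6),
}
n_obs = 848  # respondents with recruitment experience; per-model n may differ

best_by_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_by_bic = min(candidates, key=lambda m: bic(*candidates[m], n_obs))
```

In this invented example the two criteria disagree, because the BIC penalizes each additional parameter by ln(n) rather than by 2 and therefore tends to favor the smaller model in samples of this size.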

Results
The respondents identified multiple evaluative criteria as important, and only a few criteria were classified as irrelevant. For instance, the only criterion that was identified as not important by more than half of the respondents was Third Mission Activities. However, some criteria were more important than others, and respondents placed the most value in Future Potential, Matching Field, General Impression, and Important Research Contribution.
1 The RMarkdown file is available on request.
2 Physics is a large and heterogeneous field. Some researchers depend on large international infrastructure (such as ATLAS), while others primarily work by themselves and without large equipment. To test for these differences, we grouped the participants based on whether they depended on large infrastructures. We did not find significant differences between the two, and thus we treated physics as one field.

Candidates' Future Potential was the most important criterion in both fields. However, as shown in Fig. 1, we found differences between the fields. For instance, physicists more often identified Matching Field, General Impression, and Important Research Contribution as highly important, while economists more frequently valued Publication Numbers as highly important.
These academic field differences were confirmed by binary logistic regression analysis using the nine most common evaluative criteria as the dependent variables. The dot-and-whisker plot in Fig. 2 displays the coefficients for belonging to the physics field (as opposed to economics) with standard errors (see Table 1.1 in the Appendix). The results indicated that being an economist instead of a physicist increased the probability of identifying Publication Numbers as a highly important evaluative criterion from 55 to 69% and decreased the probability of identifying Research Contribution as a highly important criterion from 74 to 59%. 3 There were, however, no significant differences in how respondents from either field valued Future Potential, Citation Numbers, or Teaching Experience.
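The link between a logistic regression coefficient and the probabilities reported above can be illustrated with a short Python sketch. The coefficient here is back-calculated from the rounded percentages for Publication Numbers (55% and 69%) and is therefore only an approximation; it need not match the estimate in Table 1.1 of the Appendix. All names are hypothetical.

```python
import math

def logit(p: float) -> float:
    """Log-odds of a probability."""
    return math.log(p / (1 - p))

def inv_logit(x: float) -> float:
    """Probability corresponding to a log-odds value."""
    return 1 / (1 + math.exp(-x))

# Reported predicted probabilities of rating Publication Numbers "Highly important".
p_physicist, p_economist = 0.55, 0.69

# Log-odds shift implied by moving from physicist to economist,
# holding the other covariates at the reference profile.
implied_coef = logit(p_economist) - logit(p_physicist)
odds_ratio = math.exp(implied_coef)

# Adding the shift to the physicist's log-odds recovers the economist's probability.
recovered = inv_logit(logit(p_physicist) + implied_coef)
```

Because the logistic function is nonlinear, the same coefficient produces a different change in percentage points for a different reference profile, which is why the footnotes state the profile for which each effect was computed.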
In the analyses, we only detected moderate country differences. Figure 3 shows the percentage of respondents who identified the nine most important criteria as highly important by country.
The binary logistic regression analysis further confirmed the moderate country differences (see Table 1.1 in the Appendix), and the ANOVA tests showed that there were no country differences for the criteria Citations, Future Potential, Grants, and Research Collaboration. Moreover, the country effects were mainly due to the Dutch and, to some extent, the Norwegian respondents' answers. The Dutch respondents less frequently valued Publication Numbers as a highly important criterion and more often valued Language Skills, General Impression, and Teaching Experience as highly important compared with many of their international colleagues. 4 While there was only a 55% probability that Dutch researchers would identify Publication Numbers as highly important, the probability for researchers from other countries was between 66 and 75%. 5 Norwegian respondents also gave some deviant answers, valuing Matching Field higher and General Impression lower than many of their international colleagues. 6 The dot-and-whisker plot in Fig. 4 displays the country coefficients with the Netherlands as the dotted baseline category.
We further checked for interaction effects between country and field by conducting separate regression analyses for physicists and economists (in the Appendix, see Table 3.1 for economics and Table 4.1 for physics). These analyses confirmed that there were only moderate country differences in both fields and that the country differences were similar (e.g., both the Dutch economists and physicists were less inclined to identify Publication Numbers as a highly important criterion). At the same time, we found that the country variations played out differently in the two fields. For instance, Dutch physicists more often identified Important Research Contribution as a highly important evaluative criterion than their Norwegian colleagues, while there was no such significant difference between Dutch and Norwegian economists.
We also detected significant differences in applied evaluative criteria in relation to the type of position for which the respondents were recruiting. When recruiting seniors, the respondents were more inclined to emphasize Important Research Contribution, Publication Numbers, Citation Numbers, Grants, and Teaching Experience, whereas Future Potential, Matching Field, General Impression, and Language Skills were more important when recruiting juniors (see Table 1.1 in the Appendix). These effects had a further and quite substantial impact on the probability of valuing the different evaluative criteria. For example, recruiting to a senior position instead of a junior position increased the probability of highlighting Publication Numbers from 39 to 69% and similarly raised the probability of highlighting Teaching Experience from 18 to 39%. 7 We also controlled for the respondents' background variables, such as age, gender, and their positions (in the Appendix, see Tables 1.1-1.8). In these analyses, we observed that respondents over 40 years were more inclined to identify Publication Numbers as the most important evaluative criterion, and professors were more inclined to emphasize Future Potential, Important Research Contribution, Publication Numbers, Grants, and Teaching Experience than associate or assistant professors. However, the effects of these background variables were relatively small and did not alter the country or field effects.
Respondents were also asked to identify the single most important criterion of those criteria selected as "Highly Emphasized." Despite the plethora of important criteria, only a few were identified as the most important, with 93% of respondents selecting either Future Potential, Matching Field, Important Research Contribution, Publication Numbers, or General Impression (see Fig. 5). Being important did not imply that the criterion was the single most important. For instance, although 44% of respondents identified Language and Communication Skills as highly important, less than 1% identified it as the most important. Furthermore, it is noteworthy that four of the five most important criteria reflected [...] Publication Numbers). Hence, although candidates were evaluated on their teaching experience, language skills, grants experience, and third mission experience, their research performance was ultimately the most important. The binary logistic regression analysis using the five most important evaluative criteria as dependent variables further confirmed the strong field differences and moderate country differences shown above (see Table 2.1 in the Appendix). For instance, being a physicist rather than an economist increased the probability of identifying Matching Field as the most important criterion from 6 to 11% but decreased the probability of identifying Publication Numbers as the most important criterion from 13 to 2%. 8 Conversely, being an economist rather than a physicist increased the probability of identifying Publication Numbers as the most important criterion from 4 to 22%. 9 However, the country effects were rather moderate. The ANOVA test showed that country contributed significantly only to explaining the variance of Matching Field, which Norwegian respondents preferred more than Swedish, Danish, and Dutch respondents.
Fig. 4 (1/2) Dot-and-whisker plots from regression analysis. Evaluative criteria. Country differences. Netherlands as baseline category. Coefficients from regression in Appendix
4 The logistic regression models with the different countries as baseline categories showed that Dutch respondents significantly valued Publication Numbers less than their Norwegian, Danish, and British colleagues. Moreover, they valued Language Skills more than Swedish, Danish, and Norwegian respondents, General Impression more than Norwegian, Swedish, and British respondents, and Teaching Experience more than Swedish and Norwegian respondents.
5 The effects were computed for male researchers between 50 and 59 years old in physics recruiting to senior positions.
6 The logistic regression models with different countries as baseline categories showed that the Norwegian respondents significantly valued Matching Field more than British and Swedish respondents and General Impression less than Danish, Swedish, and Dutch respondents.
7 The effects were computed for male economists between 50 and 59 years in the Netherlands.
The separate regression analyses for the two fields (see Tables 3.2 and 4.2 in the Appendix) further confirmed the moderate country differences within the two fields and, to some extent, suggested slightly more significant differences in physics than in economics. However, there were more physicists than economists in the sample, so this result could be due to the larger number of available observations.
8 The effects were computed for male researchers between 50 and 59 years old recruiting for a senior position in the Netherlands.
9 The effects were computed for Swedes recruiting to a senior position.

Moreover, the regression analysis confirmed that the preferred evaluative criteria depended strongly on the type of position for which respondents were recruiting: senior recruitment relied more strongly on Research Contribution and Publication Numbers, while Future Potential, Matching Field, and General Impression were more frequently preferred in junior recruitment. Additionally, we controlled for age, position, and gender differences (see Tables 2.6-2.8 in the Appendix), which turned out to have relatively small effects and did not alter the country or field differences.

Discussion
Initially, we suggested two somewhat contradictory expectations based on different strands of the literature. Our first expectation suggested that national differences exist in evaluative criteria preferences due to different national contexts (Clark 1983) and path dependencies (Thelen 1999). The second suggested that increasing internationalization of disciplines would lead to isomorphism and a prominence of field-dependent preferences (Lamont 2009). In line with the second expectation and based on prior studies, we expected that economists would more strongly value candidates' bibliometrics (Hylmö 2018), while physicists would place more value on their research profiles due to a larger degree of functional dependency among researchers (Reymert 2020; Whitley 2000). In line with our first expectation, we expected researchers in the Netherlands to be least concerned with bibliometrics because their funding model does not include bibliometric indicators (Aagaard 2015; Sivertsen 2017). Our results support both expectations, although the field differences were stronger than country differences.
We found moderate country differences, but considering our case selection, those that we found were especially relevant. The most striking country difference was that Dutch scholars placed less emphasis on publications, which may be explained by the fact that bibliometric indicators are not included in the Dutch performance-based funding system, and thus, there is less of an incentive to ensure certain publication patterns in newly hired staff (Sivertsen 2017). This finding may suggest that the indicators in the performance-based research system have trickle-down effects on recruitment (Aagaard 2015). Moreover, while the Netherlands was a frontrunner in establishing teaching programs in many disciplines using English as the language of instruction, recent policy debates have increasingly highlighted the importance of the Dutch language again, which could also explain the importance of language skills in Dutch responses (Duarte and van der Ploeg 2019). These country differences give some support to our first expectation of national embeddedness, especially considering that our field selection had a slight bias toward internationalized fields in which country differences should be less likely. However, the national differences were rather moderate, and we primarily discovered national similarities, as Musselin (2010) also observed, giving stronger support to our second expectation and the internationalization perspective (DiMaggio and Powell 1983; Ramirez 2006).
In addition, we found strong field differences, which supported our second expectation that evaluative criteria were field-specific (Musselin 2010; Van den Brink and Benschop 2011). For example, our results showed that economists assessed the candidates on their publication records, while physicists relied more on their important research contributions and the relevance of their research profiles. These results aligned with prior studies on academic recruitment processes, which have shown that economists emphasize publications in top journals (Hylmö 2018). The physicists' stronger emphasis on candidates' matching research profiles and language skills aligned with prior studies (Reymert 2020); moreover, their emphasis on general impression could be understood in terms of how they work in research groups where individual researchers have a specific role (Whitley 2000), which creates a more pressing need to select candidates with compatible profiles and competencies. These field differences were similar across national borders, indicating that despite national differences in recruitment (Alfonso 2016), the evaluation processes are embedded in their fields' epistemic traditions (Lamont 2009), which are less sensitive to national considerations (Musselin 2010). However, like Musselin (2010), we found moderate differences in criteria preferences across countries, indicating that evaluation processes are to some extent affected by the national context.
Our results thus show that the particular evaluative cultures in recruitment were primarily embedded in the fields and, to some extent, in national contexts. This finding may imply that we should regard the international academic labor market as layered and multiple rather than singular. In this market, recruitment processes are nationally regulated (Alfonso 2016), but the evaluation processes are particularly tied to different internationally oriented fields, with their evaluative cultures deeply embedded in their epistemic traditions and academic work (Lamont 2009; Välimaa 1998).
Finding partial support for both of our somewhat contradictory expectations opens the question of what kind of mechanism could be driving this development. In line with Christensen et al. (2014), one could argue that disciplines provide global norms regarding the preferences of evaluative criteria and that these global norms are then filtered when they are applied in a specific national context. In this understanding, disciplines are the main normative framework for academics, while national frameworks such as laws or funding systems mediate the application of these norms. Similar mechanisms that combine global and national factors have already been identified, for example, regarding questions of internationalization (Buckner 2019; Buckner 2020).
We further observed that despite the preference for multiple evaluative criteria, only a few were identified as most important, and these primarily reflected the candidates' research output, which other studies have also shown to be the most decisive criterion in recruitment (Van den Brink and Benschop 2011). This concentration of criteria could be understood through the stagewise nature of recruitment processes, in which candidates are first met with formal standards or screened using their CVs before an expert committee undertakes a more thorough evaluation of their research (Hamann 2019; Musselin 2010). Additionally, we found that the evaluation of candidates for different types of positions required different evaluative criteria, which suggests that further studies are needed to identify how different evaluative criteria are used for different positions.

Conclusion
Academic recruitment and academics' emphasis on evaluative criteria in such processes are crucial for universities, as recruitment represents the basis for acquiring their key resource, namely talented academics. Over the past few decades, academic career structures and the academic labor market have become increasingly internationalized, with a growing number of international researchers and an increase in universities competing for the best scholars (Gornitzka and Langfeldt 2008). Still, local academic labor markets are embedded in national higher education systems and matching legal frameworks (Musselin 2005). Academics' preferences for different evaluative criteria in recruitment processes can be seen as a key indicator of the degree of internationalization of an academic labor market. However, most studies on hiring processes have hitherto focused on a single country, and comparative studies are lacking; therefore, the question of whether fields apply similar or different evaluative criteria when evaluating candidates for academic positions has gone unanswered (e.g., Hylmö 2018; Levander et al. 2019; Van den Brink and Benschop 2011). This paper has targeted this knowledge gap.
In this study, we mainly observed field differences, but we also found moderate national differences. For instance, economists valued applicants' publication numbers more highly than physicists did, whereas physicists emphasized candidates' research contributions. These findings align with prior studies addressing the fields' different evaluative cultures (Lamont 2009) and support the enduring importance of norms and values stemming from the research field's definition of peer recognition and prestige (Clark 1983; Driori et al. 2003; Hessels et al. 2019; Langfeldt et al. 2019). They further underline the differences in research practices and collaboration patterns in different research fields, and that these have to be taken into account in studies of recruitment processes. While other studies have claimed that the evaluation processes seem less sensitive to national considerations (Musselin 2010), our study found moderate but important country differences. It shows that having a bibliometric indicator in the performance-based research funding system seems to affect the emphasis on scientific publication when academics evaluate candidates for positions. Despite this, the findings first and foremost underline the internationalization of recruitment processes, showing that the evaluation of candidates is strongly embedded in internationally oriented research fields and their evaluative cultures, while being affected to a more limited degree by national frameworks. We argue that the mechanism behind this development is that disciplines provide global norms regarding the preferences of evaluative criteria and that these global norms are then mediated by national structures such as funding systems when they are applied in a specific context. These findings may imply that we should regard the international academic labor market not as singular but as layered and affected by multiple considerations.
We also note that our study only offers insights into one element of hiring processes, namely academics' preferred evaluative criteria. The ranking of candidates and their ultimate selection may be strategic (Musselin 2010), and other factors, such as academic inbreeding or informal factors in candidate selection, were not included in this study. The moderate country differences found in this study should also be seen in relation to the selection of relatively similar countries in North-Western Europe; studying countries in different corners of the world would probably have rendered stronger results. Moreover, only cross-sectional data were available, and given the steady increase in the internationalization of higher education, tracing developments not only across countries but also over time would be highly desirable. In addition, the inclusion of more countries as well as a greater variety of fields would be an important avenue for future research. Finally, using other methods, such as interviews or participant observation, to include non-formal aspects of hiring processes would also help in the development of a more complete understanding of recruitment processes. However, the strength of this paper is that our approach enabled us to contribute a systematic overview and comparison of evaluative preferences in different countries and fields.
Our findings have some policy implications. First, there seems to be a general understanding that bibliometric indicators have a dominant role in the evaluation of candidates. Our findings call for a more nuanced picture by accounting for field characteristics and national contexts. Moreover, removing bibliometric indicators from the performance-based research funding system seems to influence the preferences for such indicators in hiring processes. Furthermore, such indicators seem to be primarily preferred in fields such as economics, which is often characterized as exhibiting a rather deviant and extreme behavior compared with other disciplines (Hammarfelt and Rushforth 2017; Hylmö 2018). Second, it seems that teaching and third mission activities are not seen as important when assessing candidates for positions, a finding that has also been put forward in previous studies (Levander et al. 2019). This has implications for the development of policies that emphasize the link between research and education and the importance of third mission activities, and it should be a subject of future study.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.