The book is structured in five parts. The first part sets the scene. On the one hand, it describes the circumstances in which this book was written, that is, the environment in which this project was funded, and the situation of the humanities regarding their competition with other subjects for funding at universities and funding institutions. On the other hand, it comprises empirical studies on how peer review functions in the humanities as well as on humanities scholars’ notions of quality. The second part presents the current state of quality-based publication rankings and publication databases. It focuses on projects that have their roots in the humanities and are led by a humanities scholar, or that focus specifically on the peculiarities of humanities research. The third part raises a delicate issue: bibliometrics in the humanities. It addresses the problems of applying bibliometric methods to humanities research as well as the potential that bibliometric analyses might offer if applied the right way. The fourth part focuses on the ex-ante evaluation of humanities research in practice, presenting humanities-specific evaluation procedures. The fifth part focuses on one influential ex-post practice of research evaluation that has been completely redesigned to match the needs of humanities research: the research rating of the subjects Anglistik and Amerikanistik by the German Council of Science and Humanities.
The first part starts with a contribution by Loprieno, Werlen, Hasgall and Bregy from the Rectors’ Conference of the Swiss Universities (CRUS, since 1 January 2015 called swissuniversities). They present the environment in which this volume was put together. It is a speciality of the humanities to understand the historicity of all knowledge; hence it is fitting to open a volume on research assessment in the humanities by presenting and reflecting on the context in which it was created. Loprieno et al. describe how the Swiss universities cope with the difficulty of evaluating humanities research. Their approach is scientific in nature: following a case study in which the use of bibliometric methods in research assessment procedures for the humanities and social sciences was evaluated and found to be difficult at best, if possible at all (CRUS 2009), a project was established to investigate scientifically alternative instruments and approaches that capture aspects conventional bibliometrics cannot. The follow-up programme, drawing on the results of the first project, takes a step further and drops the concept of ‘measurement’ in favor of ‘visibility’.
The second chapter, by Wiljan van den Akker, offers the perspective of an established humanities scholar with extensive experience in leadership positions in university and research administration: as director of a research institute, as dean and as Director of Research at the Royal Academy of Sciences (KNAW) in the Netherlands. He argues that the humanities have to organize themselves to be able to play a role in science policy alongside the well-organized natural sciences. Hence, the humanities should also develop a system by which their research can be assessed. However, humanities scholars should take the steering wheel in developing such a system to avoid being assessed by systems that are not suited to the nature of humanities research.
The contribution of Lamont and Guetzkow delves into how humanities and social sciences scholars assess research in expert peer review panels. They show the differences and commonalities between some humanities and social sciences disciplines in how research is evaluated by investigating informal rules, the impact of evaluation systems on such rules and definitions of originality. They show that cognitive aspects of evaluation cannot be separated from non-cognitive aspects and describe the evaluation process (by peer review) as interactional, emotional and cognitive. Peers mobilize their self-concept as well as their expertise in evaluation. Since criteria are interpreted differently not only among peers but also across disciplines, more emphasis must be put on the effect of panel composition in evaluations.
Ochsner, Hug and Daniel investigate how humanities scholars understand research quality. They take a bottom-up perspective and present quality criteria for research based on a survey administered to all scholars holding a PhD degree in three disciplines at the Swiss and the LERU universities. A broad range of quality criteria, they conclude, must be taken into account if humanities research is to be assessed appropriately. They also show that a vast majority of humanities scholars reject a purely quantitative approach to evaluation.
The first part thus provides information on the framework in which this volume has been put together and points to the ‘Swiss way to quality’, i.e. a scientific approach towards research evaluation. It furthermore puts forward reasons why the humanities disciplines should take their evaluation into their own hands. Finally, it provides empirical evidence on how evaluation by experts works and contrasts it with humanities scholars’ grass-roots views of research quality.
The second part of the book focuses on publication rankings and publication databases. Publications lie at the heart of scientific work. Therefore, publications are often used in research evaluations, be it simply by counting the number of publications of a unit or by the use of complex rankings of publication channels.
The chapter by Gerhard Lauer opens this part of the book. He reports on the initiative of several national research funders to establish a publication database for the social sciences and humanities (SSH): the European Reference Index for the Humanities (ERIH). He describes the problems and opposition the ERIH project encountered, lists the requirements for a comprehensive (open) publication database that can be useful to the SSH and sketches the future of ERIH.
Gunnar Sivertsen presents such a publication database on the national level, the so-called Norwegian Model. It serves as the foundation of a publication-based performance indicator applied in Norway that distributes extra funding for research in a competitive way. Evaluations of the model show that a comprehensive publication database can be useful not only for research administrators but also for humanities scholars themselves: it makes humanities research visible and shows that humanities scholars conduct at least as much research as scholars in the natural and life sciences. Additionally, it can also serve information retrieval purposes for humanities scholars.
Often, publications are not just counted but also weighted according to their academic value. This is an intricate task. Elea Giménez Toledo presents how SSH journals and books are evaluated in Spain using quality criteria for publication channels. She also shows how journal and book publisher lists are used in evaluations.
The contribution by Ingrid Gogolin, finally, summarizes the European Educational Research Quality Indicators (EERQI) project. This project was initiated as a reaction against the rising relevance of numerous university rankings and citation-based indicators that do not adequately reflect the publication practices of (European) SSH research. The aim of EERQI is to combine different evaluation methods and indicators to facilitate review practices as well as to enhance the transparency of evaluation processes.
Summarizing the second part of the book: there is a lack of information about SSH publications. Establishing a database for SSH publications can make SSH research more visible, which serves scholars in terms of information retrieval. At the same time, it may also serve administrators for evaluation purposes. Thus, creating publication databases should go hand in hand with developing standards for how publication databases should, and should not, be used in SSH research evaluation.
Bibliometric indicators are among the instruments based on publication databases most commonly used to evaluate research in the natural and life sciences. The third part of the book investigates the limitations and potential of bibliometric instruments when applied to the humanities. It starts with the contribution by Björn Hammarfelt. He describes the state of the art of bibliometrics in the humanities and sketches a ‘bibliometrics for the humanities’ based on the humanities’ publication practices. He argues that while conventional bibliometrics cannot readily be used for the assessment of humanities research, bibliometrics might nevertheless complement peer review if the bibliometric methods are adapted to the social and intellectual organization of the humanities.
In the second chapter of this part, Remigius Bunia, a German literature scholar, critically investigates why bibliometrics cannot be applied in the humanities, using the example of German literature studies. While Bunia acknowledges that part of the problem is due to technical and coverage issues of the commercial citation databases, he argues that there might also be a problem intrinsic to the field of literature studies: literature scholars seem not to read the works of other literature scholars, or at least not to use (or cite) them in their own work. To test this claim, Bunia advocates applying bibliometrics to study what and how literary scholars cite and to examine their citation behaviour critically. Until light is shed on this issue, a bibliometric assessment of research performance in the humanities is not possible.
Thus, the third part of this volume shows that bibliometrics cannot be readily used to evaluate humanities research. Yet, bibliometrics adapted to the humanities can serve as tools to study publication and citation habits and patterns as well as to complement peer review. Knowing more about publication and citation habits also makes it possible to broach delicate issues in research practices.
Even though bibliometric assessment is not (yet) possible in the humanities, humanities research is assessed on a regular basis. Part four of this volume presents procedures by which humanities research is evaluated in practice and approaches to how an assessment of humanities research might look. Part four focuses on ex-ante evaluations, i.e. evaluations of research yet to be conducted. Thomas König shares insights into the evaluation practices at the European Research Council (ERC). There was not much funding of SSH research at the European level until 2007. According to König, this is not only due to the reluctance of politicians to fund SSH in general but also because (a) humanities researchers do not ask for funding as frequently as natural scientists and (b) SSH scholars are much less formally organized and thus cannot lobby as effectively on the political scene as natural scientists. However, the SSH’s share of funding for ERC grants is considerably higher than in FP7 as a whole, and rising. The distribution of grant applications shows that SSH disciplines differ in how often they ask for funding. The results also show that despite some fears of disadvantages in interdisciplinary panels, SSH disciplines reach acceptance rates similar to those of the natural sciences in ERC grants.
For the next chapter we turn to a private funding institution. Wilhelm Krull and Antje Tepperwien report how humanities research is evaluated at the Volkswagen Foundation, one of the largest private research funding institutions in Europe. To avoid the pitfalls of quantitative indicators not adapted to the characteristics of humanities research, they suggest guiding the evaluation of humanities research by four ‘I’s’: infrastructure, innovation, interdisciplinarity and internationality. They also reveal an important insight about evaluation practice in the humanities: humanities reviewers criticize even proposals that they rate as excellent, which can lead to disadvantages in interdisciplinary panels, as reviewers from the natural sciences do not understand why something can be very good even though it can be criticized.
The third chapter in this part presents evaluation procedures in France. After explaining the evaluation practices of the key actors in France—AERES, ANR, CNU and CNRS—Geoffrey Williams and Ioana Galleron describe two ongoing projects that aim at understanding the characteristics of French humanities research. The first project, DisValHum, aims at understanding the dissemination practices of French humanities scholars. The second, IMPRESHS, strives to bring about a better understanding of the variety of impacts humanities research can have.
The fourth part thus shows that humanities scholars do not apply for external funding as much as they could. Furthermore, humanities scholars are not organized well enough to lobby for humanities research at the national and international levels. Additionally, humanities research can be disadvantaged in interdisciplinary panels in ex-ante evaluations because humanities scholars also criticize work they consider excellent, whereas natural scientists feel that no work should be funded that can be criticized.
The last part of the book is dedicated to a specific ex-post evaluation procedure that has been adapted for the humanities recently: the research rating of the German Council of Science and Humanities. The contribution by Christian Mair briefly describes the history of, and ideas behind, the research rating. He argues that the failure of the first attempt to conduct a pilot study for the research rating in the humanities was mainly a communication problem. He then describes the process of fleshing out a rating procedure adapted to the humanities by an expert group of humanities scholars that resulted in a pilot study of the research rating in Anglistik/Amerikanistik.
The joint contribution by Klaus Stierstorfer and Peter Schneck gives insight into the arguments for and against participating in such a rating exercise from the presidents of the two associations involved, the Deutsche Anglistenverband (German Association for English Studies) and the Deutsche Gesellschaft für Amerikastudien (German Association for American Studies). Stierstorfer, then president of the Deutsche Anglistenverband, argues that although research ratings as such are not naturally in the interest of humanities scholars, they are likely here to stay, and some collateral benefits might nevertheless accrue. Hence, the rating has to be optimized to maximize such benefits. Peter Schneck, president of the Deutsche Gesellschaft für Amerikastudien from 2008 to 2011, also takes a very critical stance on the usefulness of research ratings. He acknowledges, however, that rating is an integral part of academic life, in the humanities as well (e.g. grading students and rating applicants for a professorship). Therefore, he argues, the humanities should actively engage in the discussion about standards for research assessments rather than boycott them.
The research rating Anglistik/Amerikanistik was completed in 2012. The third contribution of this part presents experiences from the pilot study from the perspective of the Council staff involved and of members of the review board. It starts with the conclusions drawn from the pilot by the German Council of Science and Humanities, describes the exact procedure of the research rating of Anglistik/Amerikanistik and concludes that the research rating is suitable for taking the specifics of humanities research practice into account in research assessments. The contribution continues with the perspective of Alfred Hornung, who chaired the review board of the rating as an Amerikanistik scholar. He describes the critiques and concerns that accompanied the rating as well as the benefits of the exercise. Barbara Korte concludes the contribution with her insights into the pilot study as a member of the review board and as an Anglistik scholar. She illustrates the difficulties of defining subdisciplines within a broad field. She warns that while the research rating helped to show the diversity of English studies, it may also have prompted more thinking about divisions than about common research interests.
Finally, the contribution by Ingo Plag presents an empirical analysis of the assessments produced in the research rating Anglistik/Amerikanistik. His analysis shows quite low variability in the ratings across raters, pointing to a high reliability of the research rating. Most criteria correlate highly with each other. However, third-party funding proves not to be a good indicator of research quality.