International Journal on Digital Libraries, Volume 19, Issue 2–3, pp 151–161

The references of references: a method to enrich humanities library catalogs with citation data

  • Giovanni Colavizza
  • Matteo Romanello
  • Frédéric Kaplan


The advent of large-scale citation indexes has greatly impacted the retrieval of scientific information in several domains of research. The humanities have largely remained outside of this shift, despite their increasing reliance on digital means for information seeking. Given that publications in the humanities have a longer than average life-span, mainly due to the importance of monographs for the field, this article proposes to use domain-specific reference monographs to bootstrap the enrichment of library catalogs with citation data. Reference monographs are works considered to be of particular importance in a research library setting, and likely to possess characteristic citation patterns. The article shows how to select a corpus of reference monographs, and proposes a pipeline to extract the network of publications they refer to. Results using a set of reference monographs in the domain of the history of Venice show that only 7% of extracted citations are made to publications already within the initial seed. Furthermore, the resulting citation network suggests the presence of a core set of works in the domain, cited more frequently than average.
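The two measurements mentioned above (the share of extracted citations that point back into the seed, and the set of works cited more often than average) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `seed_share` and `core_works`, the `min_ratio` threshold, and the toy edge list are all assumptions made for illustration, and the numbers it produces say nothing about the Venice corpus.

```python
from collections import Counter


def seed_share(citations, seed):
    """Return the fraction of citations whose target is itself in the seed."""
    if not citations:
        return 0.0
    inside = sum(1 for _, cited in citations if cited in seed)
    return inside / len(citations)


def core_works(citations, min_ratio=2.0):
    """Return works cited at least `min_ratio` times the mean citation count.

    The threshold is a hypothetical choice, not one taken from the article.
    """
    counts = Counter(cited for _, cited in citations)
    if not counts:
        return []
    mean = sum(counts.values()) / len(counts)
    return sorted(work for work, n in counts.items() if n >= min_ratio * mean)


# Toy, made-up data: (citing seed monograph, cited publication) pairs.
seed = {"M1", "M2", "M3"}
citations = [
    ("M1", "A"), ("M1", "B"), ("M1", "M2"),
    ("M2", "A"), ("M2", "C"),
    ("M3", "A"), ("M3", "D"), ("M3", "E"),
]

share = seed_share(citations, seed)  # 1 of 8 citations targets a seed work
core = core_works(citations)         # "A" is cited 3 times, the mean is 8/6
```

A low `share` means most cited publications lie outside the seed, which is the situation in which citation extraction adds new records to a catalog rather than only linking existing ones.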


Digital libraries, Bibliometrics, Citation extraction, Information retrieval, History of Venice



We thank Martina Babetto and Silvia Ferronato for the digitization and annotation of the dataset. The Library of the Ca’ Foscari University of Venice generously provided bibliographical resources and logistical support. The Central Institute for the Union Catalogue of Italian Libraries and Bibliographic Information (ICCU) shared its catalog metadata with us. We thank both institutions for their support. Finally, we thank our anonymous reviewers for their helpful comments. This project is funded by the Swiss National Fund under Division II, project number 205121_159961. Colavizza also benefits from a separate Swiss National Fund grant, number P1ELP2_168489.



Copyright information

© Springer-Verlag Berlin Heidelberg 2017

Authors and Affiliations

  1. Digital Humanities Laboratory, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
