
Pennants for Garfield: bibliometrics and document retrieval

Scientometrics

Abstract

Eugene Garfield’s name, like that of any prolific author, can designate both an oeuvre and a person. That duality is explored here with pennant diagrams, a decade-old technique that can structure information about both oeuvres and persons in one scatterplot. Such diagrams are not readily made now, but may have a place in recommender systems of the future. This paper recapitulates the basics of creating and understanding them. In pennants, every term in a bibliometric distribution is weighted with a version of the TF * IDF formula from information retrieval. The distributions are generated by a seed term, such as a cited author’s name or a subject phrase, and consist of terms that co-occur with the seed in a database. TF * IDF orders the terms by relevance and specificity with respect to the seed—an outcome interpretable in light of relevance theory from linguistic pragmatics. Garfield’s name appears illustratively as a seed in one pennant and as a co-cited author in five others. Another example shows works by him and others that co-occur with the phrase “Citation Analysis” in Scisearch. Pennants are richly suggestive about authors, and here they are linked to a fruitful idea of Garfield’s that appeared in his first paper.
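The weighting step described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's own procedure: it assumes the common log-scaled variant, (1 + log10 tf) × log10(N / df), where tf is a term's co-occurrence count with the seed, df is the term's total count in the database, and N is the database size. The function names, the toy counts, and the choice of base-10 logs are all assumptions made for the example.

```python
import math


def pennant_coordinates(cooccurrence, totals, collection_size):
    """Return {term: (tf_weight, idf_weight)} for terms co-occurring with a seed.

    cooccurrence: {term: count of records in which the term appears with the seed}
    totals: {term: count of records in which the term appears at all}
    collection_size: total number of records in the database (N)
    Assumed variant: tf_weight = 1 + log10(tf), idf_weight = log10(N / df).
    """
    coords = {}
    for term, tf in cooccurrence.items():
        df = totals[term]
        if tf <= 0 or df <= 0:
            continue  # a logarithm is undefined for non-positive counts
        tf_weight = 1 + math.log10(tf)
        idf_weight = math.log10(collection_size / df)
        coords[term] = (tf_weight, idf_weight)
    return coords


def rank_terms(coords):
    """Order terms by the product of their two weights, highest first."""
    return sorted(coords, key=lambda t: coords[t][0] * coords[t][1], reverse=True)


# Hypothetical counts, invented for illustration only.
cooccurrence = {"author_a": 100, "author_b": 100, "author_c": 10}
totals = {"author_a": 1_000, "author_b": 100_000, "author_c": 100}

coords = pennant_coordinates(cooccurrence, totals, collection_size=1_000_000)
print(rank_terms(coords))
```

In a scatterplot, the first coordinate would go on the horizontal axis and the second on the vertical axis. In the toy data, author_a and author_b co-occur with the seed equally often, but author_b is far more common in the database overall, so its IDF weight, and hence its rank, is lower.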


Notes

  1. I recently learned that CiteSeer uses TF * IDF to recommend publications that are bibliographically coupled with (not co-cited with) a seed document. The weight, briefly explained in Lawrence et al. (1999), is called CCIDF, standing for “the common citations between any pair of documents weighted by the inverse frequency of citation.” But CiteSeer simply lists the titles so retrieved as “Active Documents” and does not explain what that means.
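A rough sketch of the idea behind CCIDF follows. Lawrence et al. (1999) describe the weight only briefly, so everything concrete here is an assumption: each cited work is weighted by 1 / (number of documents in the toy collection that cite it), and a candidate's score is the sum of those weights over the citations it shares with the seed. The function name and the data are hypothetical, and this is not CiteSeer's actual code.

```python
from collections import Counter


def ccidf_scores(seed_refs, candidates):
    """Score candidates by common citations weighted by inverse citation frequency.

    seed_refs: set of works cited by the seed document
    candidates: {doc_id: set of works cited by that document}
    Assumed weight of a cited work: 1 / (number of documents citing it),
    counted over the candidates plus the seed.
    """
    citation_freq = Counter()
    for refs in candidates.values():
        citation_freq.update(refs)
    citation_freq.update(seed_refs)

    scores = {}
    for doc_id, refs in candidates.items():
        common = seed_refs & refs  # bibliographic coupling: shared references
        scores[doc_id] = sum(1.0 / citation_freq[work] for work in common)
    return scores


# Hypothetical reference lists, invented for illustration only.
seed_refs = {"common_work", "rare_work"}
candidates = {
    "d1": {"common_work"},
    "d2": {"rare_work"},
    "d3": {"common_work"},
    "d4": {"common_work", "other_work"},
}

scores = ccidf_scores(seed_refs, candidates)
print(sorted(scores, key=scores.get, reverse=True))
```

Under these assumptions, sharing a rarely cited reference with the seed counts for more than sharing a widely cited one, which is the same specificity effect that IDF gives to terms.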

References

  • Akbulut, M. (2016a). Atıf klasiklerinin etkisinin ve ilgililik sıralamalarının pennant diyagramları ile analizi [The analysis of the impact of citation classics and relevance rankings using pennant diagrams]. Yayımlanmamış yüksek lisans tezi, Hacettepe Üniversitesi, Ankara [Unpublished master’s thesis, Hacettepe University, Ankara]. http://www.mugeakbulut.com/yayinlar/Muge_Akbulut_YL_Tez.pdf.

  • Akbulut, M. (2016b). Extended abstract: The analysis of the impact of citation classics and relevance rankings using pennant diagrams. http://www.mugeakbulut.com/yayinlar/tez_extended_abstract.pdf.

  • Arf, C. (1941). Untersuchungen über quadratische Formen in Körpern der Charakteristik 2. Teil I [Investigations on quadratic forms in fields of characteristic 2. Part I]. Journal für die Reine und Angewandte Mathematik [Journal for pure and applied mathematics], 183, 148–167.

  • Bates, M. J. (1989). The design of browsing and berrypicking techniques for the online search interface. Online Review, 13, 407–424.


  • Bonacich, P. (1987). Power and centrality: A family of measures. American Journal of Sociology, 92(5), 1170–1182.


  • Carevic, Z., & Mayr, P. (2014). Recommender systems using pennant diagrams in digital libraries. In Paper presented at NKOS workshop, London, September 12, 2014. https://arxiv.org/ftp/arxiv/papers/1407/1407.7276.pdf.

  • Egghe, L. (2005). Power laws in the information production process: Lotkaian informetrics. Amsterdam: Elsevier.


  • Furner, J. (2016). Type-token theory and bibliometrics. In C. R. Sugimoto (Ed.), Theories of informetrics and scholarly communication. Berlin: Walter de Gruyter GmbH & Co KG.


  • Garfield, E. (1955). Citation indexes for science: A new dimension in documentation through association of ideas. Science, 122, 108–111. http://www.garfield.library.upenn.edu/papers/science_v122v3159p108y1955.html.

  • Garfield, E. (1972). Citation analysis as a tool in journal evaluation. Science, 178, 471–479. http://www.garfield.library.upenn.edu/essays/V1p527y1962-73.pdf.

  • Garfield, E. (1979). Citation indexing: Its theory and application in science, technology, and humanities. New York: Wiley. http://www.garfield.library.upenn.edu/ci/title.pdf.

  • Garfield, E. (1997). Validation of citation analysis [Letter, with a rejoinder by MacRoberts]. Journal of the American Society for Information Science, 48(10), 962–963.


  • Garfield, E. (2013). A century of citation indexing: Keynote address. In COLLNET 2011 Proceedings (pp. 5–10). https://mafiadoc.com/collnet-2011-proceedings_59aff6e81723ddb8c56166a8.html.

  • Harter, S. P. (1992). Psychological relevance and information science. Journal of the American Society for Information Science, 43(9), 602–615.


  • Holmberg, J. H. (2012). Dynamisk kunskapsorganisation: teoretisk ansats och implementering [Dynamic knowledge organization: Theoretical approach and implementation]. Bachelor's thesis, University of Borås/Swedish School of Library and Information Science.

  • Kuhn, T. S. (1962). The structure of scientific revolutions. Chicago: University of Chicago Press.

  • Larsen, B. (2008). Informetrics and IR. Presentation at the Nordic Research School in Information Studies (NORSLIS), Umea, Sweden. http://itlab.dbit.dk/~blar/files/Norslis_Umea-june2008_BL2.ppt.

  • Lawrence, S., Giles, C. L., & Bollacker, K. (1999). Digital libraries and autonomous citation indexing. IEEE Computer, 32(6), 67–71.


  • Lowry, O. H., Rosebrough, N. J., Farr, A. L., & Randall, R. J. (1951). Protein measurement with the Folin phenol reagent. Journal of Biological Chemistry, 193(1), 265–275.


  • MacRoberts, M. H., & MacRoberts, B. R. (1989). Problems of citation analysis: A critical review. Journal of the American Society for Information Science, 40(5), 342–349.

  • Manning, C. D., Raghavan, P., & Schütze, H. (2008). An introduction to information retrieval. Cambridge: Cambridge University Press.


  • Manning, C. D., & Schütze, H. (1999). Foundations of statistical natural language processing. Cambridge: MIT Press.


  • Maron, M. E., & Kuhns, J. L. (1960). On relevance, probabilistic indexing and information retrieval. Journal of the ACM, 7(3), 216–244.


  • Price, D. J. D. (1970). Citation measures of hard science, soft science, technology, and nonscience. In C. E. Nelson & D. K. Pollock (Eds.), Communication among scientists and engineers. Lexington: Heath Lexington Books.


  • Sandstrom, P. E., & White, H. D. (2007). The impact of cultural materialism: A bibliometric analysis of the writings of Marvin Harris. In L. A. Kuznar & S. K. Sanderson (Eds.), Studying societies and cultures: Marvin Harris’s cultural materialism and its legacy (pp. 20–55). Boulder: Paradigm Publishers.


  • Schneider, J. W., Larsen, B., & Ingwersen, P. (2007). Pennant diagrams, what is it [sic], what are the possibilities and are they useful? Presentation at the Nordic Workshop on Bibliometrics and Research Policy, Copenhagen, September 13–14, 2007. https://pdfs.semanticscholar.org/b674/7068496b8b72a5b017281b2dce75844b1e3d.pdf.

  • Selye, H. (1946). The general adaptation syndrome and the diseases of adaptation. Journal of Clinical Endocrinology, 6(2), 117–230.


  • Sparck Jones, K. (1972). A statistical interpretation of term specificity and its application to retrieval. Journal of Documentation, 28(1), 11–21.


  • Sperber, D., & Wilson, D. (1986). Relevance: Communication and cognition. Harvard University Press.

  • Sperber, D., & Wilson, D. (1995). Relevance: Communication and cognition. 2nd edn. with postface. Blackwell.

  • Tonta, Y., & Çelik, A. E. Ö. (2013). Cahit Arf: Exploring his scientific influence using social network analysis, author co-citation maps and single publication h index. Journal of Scientometric Research, 2(1), 37–51. http://www.jscires.org/article/38.

  • White, H. D. (2000). Toward ego-centered citation analysis. In B. Cronin & H. B. Atkins (Eds.), The web of knowledge: A festschrift in honor of Eugene Garfield (pp. 475–496). Medford Township: Information Today.


  • White, H. D. (2007a). Combining bibliometrics, information retrieval, and relevance theory, part 1: First examples of a synthesis. Journal of the American Society for Information Science and Technology, 58(4), 536–559.

  • White, H. D. (2007b). Combining bibliometrics, information retrieval, and relevance theory, part 2: Some implications for information science. Journal of the American Society for Information Science and Technology, 58(4), 583–605.

  • White, H. D. (2009). Pennants for Strindberg and Persson. Celebrating scholarly communication studies: A festschrift for Olle Persson at his 60th birthday. Special volume of the e-newsletter of the International Society for Scientometrics and Informetrics, 5-S, 71–83. https://www.researchgate.net/publication/229861362_Pennants_for_Strindberg_and_Persson.

  • White, H. D. (2010a). Some new tests of relevance theory in information science. Scientometrics, 83(3), 653–667.


  • White, H. D. (2010b). Ingwersen’s image and identity compared. In B. Larsen, J. W. Schneider, F. Ångström, & B. Schlemmer (Eds.), The Janus faced scholar: A festschrift in honour of Peter Ingwersen. Special volume of the e-zine of the International Society for Scientometrics and Informetrics, 6-S, 219–227. http://vbn.aau.dk/files/90357690/JanusFacedScholer_Festschrift_PeterIngwersen_2010.pdf#page=222.

  • White, H. D. (2011). Relevance theory and citations. Journal of Pragmatics, 43(14), 3345–3361.


  • White, H. D. (2014). Co-cited author retrieval and relevance theory: Examples from the humanities. Scientometrics, 102(3), 2275–2299.


  • White, H. D. (2016a). Authors as persons and authors as bundles of words. In C. R. Sugimoto (Ed.), Theories of informetrics and scholarly communication: A festschrift in honor of Blaise Cronin. Berlin: Walter de Gruyter.


  • White, H. D. (2016b). Bibliometrics, librarians, and bibliograms. Education for Information, 32(2), 125–148.


  • White, H. D. (2017a). Bag of works retrieval: TF * IDF weighting of works co-cited with a seed. International Journal on Digital Libraries, 1–11. https://link.springer.com/article/10.1007/s00799-017-0217-7.

  • White, H. D. (2017b). Relevance theory and distributions of judgments in document retrieval. Information Processing and Management, 53(5), 1080–1102.


  • White, H. D., & Mayr, P. (2013). Pennants for descriptors. Paper presented at the NKOS Workshop, Valletta, Malta, September 26, 2013. https://arxiv.org/abs/1310.3808.

  • Wilson, C. S. (1999). Informetrics. Annual Review of Information Science and Technology, 34, 107–247.


  • Yongxia, L., Liu, Z., & Chen, C.M. (2009). Eugene Garfield’s contributions to the formation and development of citation analysis—A visual analysis of Eugene Garfield’s publications in celebration of his 84th birthday. In Proceedings of the Fifth International Conference on WIS & Tenth COLLNET Meeting, Dalian, China, September 13-16, 2009. CD-ROM.


Author information


Correspondence to Howard D. White.

Appendix


See Table 3.

Table 3. Papers with examples that bear on pennant analysis. As stated earlier, they do not contain pennants but explicitly or implicitly relate relevance theory to bibliometric distributions, adding examples of relative ease of processing. The first two involve terms weighted by TF * IDF; the remainder involve terms that are unweighted.


About this article


Cite this article

White, H.D. Pennants for Garfield: bibliometrics and document retrieval. Scientometrics 114, 757–778 (2018). https://doi.org/10.1007/s11192-017-2610-9


  • DOI: https://doi.org/10.1007/s11192-017-2610-9
