Co-cited author retrieval and relevance theory: examples from the humanities

Scientometrics

Abstract

Given a user-selected seed author, a unique experimental system called AuthorWeb can return the 24 authors most frequently co-cited with the seed in a 10-year segment of the Arts and Humanities Citation Index. The Web-based system can then instantly display the seed and the others as a Pathfinder network, a Kohonen self-organizing map, or a pennant diagram. Each display gives a somewhat different overview of the literature cited with the seed in a specialty (e.g., Thomas Mann studies). Each is also a live interface for retrieving (1) the documents that co-cite the seed with another user-selected author, and (2) the works by the seed and the other author that are co-cited. This article describes the Pathfinder and Kohonen maps, but focuses much more on AuthorWeb pennant diagrams, exhibited here for the first time. Pennants are interesting because they unite ego-centered co-citation data from bibliometrics, the TF*IDF formula from information retrieval, and Sperber and Wilson’s relevance theory (RT) from linguistic pragmatics. RT provides a cognitive interpretation of TF*IDF weighting. By making people’s inferential processes a primary concern, RT also yields insights into both topical and non-topical relevance, central matters in information science. Pennants for several authors in the humanities demonstrate these insights.
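
The way a pennant combines co-citation counts with TF*IDF can be illustrated with a minimal sketch. It assumes the log-scaled weights common in information retrieval: a horizontal value log10(tf), where tf is an author's co-citation count with the seed, and a vertical value log10(N/df), where df is the author's overall count in the database and N is the size of the collection. The function name `pennant_points`, its parameters, and all the counts below are hypothetical and are not taken from AuthorWeb; the default of 24 authors follows the abstract.

```python
import math
from typing import Dict, List, Tuple


def pennant_points(
    cocited_with_seed: Dict[str, int],
    overall_counts: Dict[str, int],
    collection_size: int,
    top_k: int = 24,
) -> List[Tuple[str, float, float]]:
    """Return (author, x, y) for the top_k authors most co-cited with a seed.

    x = log10(tf)      tf: co-citation count with the seed (assumed weighting)
    y = log10(N / df)  df: the author's overall count in the collection
    """
    # Keep only authors actually co-cited with the seed (log of 0 is undefined).
    candidates = [(a, tf) for a, tf in cocited_with_seed.items() if tf > 0]
    # Highest co-citation count first; ties broken alphabetically.
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))

    points = []
    for author, tf in candidates[:top_k]:
        df = overall_counts[author]
        x = math.log10(tf)
        y = math.log10(collection_size / df)
        points.append((author, x, y))
    return points


# Hypothetical counts for three placeholder authors (not real data).
example = pennant_points(
    cocited_with_seed={"Author A": 100, "Author B": 10, "Author C": 1000},
    overall_counts={"Author A": 1000, "Author B": 100, "Author C": 100000},
    collection_size=1_000_000,
)
for author, x, y in example:
    print(f"{author}: x={x:.2f}, y={y:.2f}")
```

In this toy data, Author C is co-cited with the seed most often but is also cited very widely overall, so the point lies far to the right and low. Author B is co-cited less often but is rare in the collection, so the point lies to the left and high.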

Notes

  1. Dialog, the “Cadillac” of bibliographic information services since the 1960s, was purchased by ProQuest in 2006. In 2013 ProQuest took the decades-old DialogClassic software out of service and terminated access through it to the Thomson Reuters citation databases.

  2. More than 25 nodes can be mapped in extensions of AuthorWeb software to medical databases in the Visual Concept Explorer (Zhu et al. 2005; Lin et al. 2007), but such additions arguably lead to information overload.

  3. Which expands to: Freud, Sigmund. (1921). Massenpsychologie und Ich-Analyse. Wien: Internationaler Psychoanalytischer Verlag. [Group Psychology and the Analysis of the Ego. Vienna: International Psychoanalytic Press.] The commas at the end of strings indicate truncations.

  4. In White (2007a, b, 2009, 2010a) I divided the pennants into sectors on the basis of my own qualitative judgments. AuthorWeb pennants are simply divided into thirds mechanically, and so their qualitative implications are even more approximate.

  5. Harter et al. (1993) sampled pairs of citing-cited documents (not co-cited documents) and analyzed their subject indexing for presumed semantic closeness. They found that such closeness could not be taken for granted, because the descriptors assigned to the pairs rarely matched exactly. But perfect descriptor match is a highly restrictive standard. It misses partial matches, leaves out terms from titles and abstracts altogether, and ignores semantic ties that occur in body text, rather than at the level of the entire works. More generally, it discounts the human ability to infer semantic relations that fall outside exact term-matching. However, since Harter (1992) had already advocated Sperber and Wilson’s subjective approach to relevance, the inadequacy of an “objective” descriptor-based approach seems to be the real point of the 1993 article.

  6. The form Bates-M appears because the same article sometimes cites Marcia or Milton both with and without their middle initial.

  7. T. S. Eliot is not completely irrelevant to Marcia Bates’s work. Information scientists occasionally quote his lines in Choruses from “The Rock”: “Where is the wisdom we have lost in knowledge? Where is the knowledge we have lost in information?” Since a few of them have also cited her in the same article, her co-citation count with Eliot is not zero (cf. Bates 2010).

References

  • Bar-Ilan, J. (2006). An ego-centric citation analysis of the works of Michael O. Rabin based on multiple citation indexes. Information Processing and Management, 42, 1553–1566.

  • Bates, M. J. (2010). Information. Encyclopedia of library and information sciences (3rd ed., pp. 2347–2360). New York: CRC Press.

  • Buzydlowski, J. (2002). A comparison of self-organizing maps and pathfinder networks for the mapping of co-cited authors. PhD dissertation. Drexel University, Philadelphia, PA. http://faculty.cis.drexel.edu/~jbuzydlo/papers/thesis.pdf.

  • Buzydlowski, J. W., White, H. D., & Lin, X. (2003). Term co-occurrence analysis as an interface for digital libraries. Lecture Notes in Computer Science, 2539, 133–144. http://web3.holyfamily.edu/jbuz/papers/lncs.pdf.

  • Carston, R., & Powell, G. (2008). Relevance theory: New directions and developments. The Oxford handbook of philosophy of language (pp. 341–360). Oxford: Oxford University Press. [Also in Oxford handbooks online and at http://www.phon.ucl.ac.uk/home/robyn/Carston-Powell-PhilHandbook-28July05%5B2%5D.pdf.]

  • Clark, B. (2013). Relevance theory. Cambridge: Cambridge University Press.

  • Cronin, B., & Shaw, D. (2001). Identity-creators and image-makers: Using citation analysis and thick description to put authors in their place. Proceedings of the 8th international conference on scientometrics & informetrics, Vol. 1, 127–138.

  • Dowden, S. (1994). Irony and ethical autonomy in Wilhelm Meisters Wanderjahre. Deutsche Vierteljahrsschrift für Literaturwissenschaft und Geistesgeschichte, 68, 134–154.

  • Eisenberg, M. B. (1988). Measuring relevance judgments. Information Processing and Management, 24, 373–389.

  • Fehn, A. C. (1988). Concepts of the masses and German drama in the Weimar Republic. Seminar: A Journal of Germanic Studies, 24, 31–57.

  • Franzen, J. (2013). The Kraus project: Essays by Karl Kraus. New York: Farrar, Straus and Giroux.

  • Furner, J. (2004). Information studies without information. Library Trends, 52, 427–445.

  • GESIS. (2013). Combining bibliometrics and information retrieval. Pre-conference workshop of the International Society for Scientometrics and Bibliometrics. Vienna, Austria, 2013. http://www.gesis.org/en/events/conferences/issiworkshop2013/.

  • Goatly, A. (1997). The language of metaphors. London: Routledge.

  • Green, R. (1995). Topical relevance relationships. I. Why topic matching fails. Journal of the American Society for Information Science, 46, 646–653.

  • Grice, H. P. (1989). Studies in the way of words. Cambridge, MA: Harvard University Press.

  • Harter, S. P. (1992). Psychological relevance and information science. Journal of the American Society for Information Science, 43, 602–615.

  • Harter, S. P., Nisonger, T. E., & Weng, A. (1993). Semantic relationships between cited and citing articles in library and information science journals. Journal of the American Society for Information Science, 44, 543–552.

  • Higashimori, I., & Wilson, D. (1996). Questions on relevance. University College London Working Papers in Linguistics, 8, 111–124. http://www.phon.ucl.ac.uk/home/PUB/WPL/96papers/higashi.pdf.

  • Hu, X., Rousseau, R., & Chen, J. (2012). Structural indicators in citation networks. Scientometrics, 91, 451–460.

  • Hu, Z., Chen, C., & Liu, Z. (2013). Where are citations located in the body of scientific articles? A study of the distributions of citation locations. Journal of Informetrics, 7, 887–896.

  • Jeong, Y. K., Song, M., & Ding, Y. (2014). Content-based author co-citation analysis. Journal of Informetrics, 8, 197–211.

  • Lepenies, W., & Harshav, B. (1988). Between social science and poetry in Germany. Poetics Today, 9, 117–143.

  • Lin, X., Bui, Y., & Zhang, D. (2007). Visualization of knowledge structure. Proceedings of the 11th International Conference of Information Visualization (IV2007) (pp. 476–481).

  • Lin, X., White, H. D., & Buzydlowski, J. (2003). Real-time author co-citation mapping for online searching. Information Processing and Management, 39, 689–706. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.83.4125&rep=rep1&type=pdf.

  • Malina, D. (2002). Breaking the frame: Metalepsis and the construction of the subject. Columbus, OH: Ohio State University Press. http://ohiostatepress.org/index.htm?/books/book%20pages/malina%20breaking.html.

  • Manning, C., Raghavan, P., & Schütze, H. (2008). An introduction to information retrieval. Cambridge: Cambridge University Press. Draft of 2009 ed.: http://nlp.stanford.edu/IR-book/pdf/irbookprint.pdf.

  • McCain, K. W. (2010). The view from Garfield’s shoulders: Tri-citation mapping of Eugene Garfield’s citation image over three successive decades. Annals of Library and Information Studies, 57, 261–270.

  • Nakov, P. I., Schwartz, A. S., & Hearst, M. A. (2004). Citances: Citation sentences for semantic analysis of bioscience text. Proceedings, SIGIR’04 workshop on search and discovery in bioinformatics, Sheffield, UK, 2004. http://biotext.berkeley.edu/papers/citances-nlpbio04.pdf.

  • Rigby, J. (2013). E-mail on AuthorWeb maps for Anthony Milton.

  • Ritchie, A. (2008). Citation context analysis for information retrieval. Technical Report 744, Computer Laboratory, University of Cambridge. http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-744.html.

  • Ritchie, A., Teufel, S., & Robertson, S. (2008). Using terms from citations for IR: Some first results. Paper presented at the European conference on information retrieval, Glasgow, UK, 2008. http://www.cl.cam.ac.uk/~sht25/papers/ecir2008_ritchie.pdf.

  • Saracevic, T. (2007). Relevance: A review of the literature and a framework for thinking on the notion in information science. Part III: Behavior and effects of relevance. Journal of the American Society for Information Science and Technology, 58, 2126–2144.

  • Schneider, J. W., Larsen, B., & Ingwersen, P. (2007). Pennant diagrams: What is it [sic], what are the possibilities, and are they useful? Presentation at the 12th Nordic workshop in bibliometrics and research policy, Copenhagen, Denmark, 2007. http://yunus.hacettepe.edu.tr/~tonta/courses/spring2011/bby704/pennant-diagrams-2c_Peter_Ingwersen.pdf.

  • Shannon, C. E., & Weaver, W. (1949). The mathematical theory of communication. Urbana, IL: University of Illinois Press.

  • Sivertsen, G. (2013). E-mail on AuthorWeb maps for Ludvig Holberg.

  • Sparck Jones, K. (1972). A statistical interpretation of term specificity and its application to retrieval. Journal of Documentation, 28, 11–21.

  • Sperber, D., & Wilson, D. (1995). Relevance: Communication and cognition (2nd ed.). Oxford: Blackwell. [1st ed., 1986.]

  • Sperber, D., & Wilson, D. (1996). Fodor’s frame problem and relevance theory (reply to Chiappe & Kukla). Behavioral and Brain Sciences, 19, 530–532. http://cogprints.org/2029/1/frame.htm.

  • White, H. D. (2000). Toward ego-centered citation analysis. In B. Cronin & H. B. Atkins (Eds.), The web of knowledge: A Festschrift in honor of Eugene Garfield (pp. 475–496). Medford, NJ: Information Today.

  • White, H. D. (2002). Cross-textual cohesion and coherence. Paper presented at discourse architectures: Designing and visualizing computer mediated conversation, a workshop of CHI (Computer Human Interface), Minneapolis, MN, 2002. http://pliant.org/personal/Tom_Erickson/DA_White.pdf.

  • White, H. D. (2007a). Combining bibliometrics, information retrieval, and relevance theory: Part 1: First examples of a synthesis. Journal of the American Society for Information Science and Technology, 58, 536–559.

  • White, H. D. (2007b). Combining bibliometrics, information retrieval, and relevance theory: Part 2. Implications for information science. Journal of the American Society for Information Science and Technology, 58, 583–605.

  • White, H. D. (2009). Pennants for Strindberg and Persson. In Celebrating scholarly communication studies: A Festschrift for Olle Persson at his 60th birthday. Special Volume of the E-Newsletter, International Society for Scientometrics and Informetrics (pp. 71–83). http://www.issi-society.info/ollepersson60/ollepersson60.pdf.

  • White, H. D. (2010a). Some new tests of relevance theory in information science. Scientometrics, 83, 653–667.

  • White, H. D. (2010b). Relevance in theory. Encyclopedia of library and information sciences (3rd ed., pp. 4498–4511). New York: CRC Press.

  • White, H. D. (2011). Relevance theory and citations. Journal of Pragmatics, 43, 3345–3361.

  • White, H. D., Lin, X., & Buzydlowski, J. W. (2001). The endless gallery: Visualizing authors’ citation images in the humanities. Proceedings of the American Society of Information Science and Technology, 38, 182–189.

  • White, H. D., Lin, X., Buzydlowski, J. W., & Chen, C. (2004). User-controlled mapping of significant literatures. Proceedings of the National Academy of Sciences of the United States of America, 101(Supp. 1), 5297–5302.

  • White, H. D., & Mayr, P. (2013). Pennants for descriptors. Paper presented at Networked Knowledge Organization Systems (NKOS) workshop, Valletta, Malta, 2013. http://arxiv.org/abs/1310.3808.

  • Wilson, D. (2011). Relevance and the interpretation of literary works. University College London Working Papers in Linguistics, 23, 69–80. http://www.ucl.ac.uk/psychlangsci/research/linguistics/publications/wpl/11papers/Wilson2011.

  • Wilson, D., & Sperber, D. (2004). Relevance theory. In L. R. Horn & G. Ward (Eds.), The handbook of pragmatics. Oxford, England: Blackwell.

  • Yus, F. (2011). Cyberpragmatics: Internet-mediated communication in context. Amsterdam: John Benjamins.

  • Zhu, W., Lin, X., Hu, X., & Sokhansanj, B. A. (2005). Visualization of protein-protein interaction network for knowledge discovery. 2005 IEEE: International Conference on Granular Computing, 1, 373–377.

Acknowledgments

GESIS, the Leibniz Institute for the Social Sciences in Cologne, Germany, generously supported preliminary work on this article during summer 2013. I am grateful to several GESIS colleagues for stimulating discussions: Andreas Strotmann, Philipp Mayr, Philipp Schaer, and Maria Zens. Ideas in the article were also presented in a talk at the University of Amsterdam, and I thank Alesia Zuccala (of the University’s Center for Digital Humanities) and Andrea Scharnhorst (Royal Netherlands Academy of Arts and Sciences) for the invitation. Rens Bod, John Rigby, and Gunnar Sivertsen kindly suggested seed authors for me to map. I asked the latter two for reactions, and they provided them. I am pleased to quote from them here. I also thank my anonymous referees for instructive comments.

Author information

Corresponding author

Correspondence to Howard D. White.

About this article

Cite this article

White, H.D. Co-cited author retrieval and relevance theory: examples from the humanities. Scientometrics 102, 2275–2299 (2015). https://doi.org/10.1007/s11192-014-1483-4

  • DOI: https://doi.org/10.1007/s11192-014-1483-4
