Scientometrics, Volume 102, Issue 3, pp 2323–2345

Science models for search: a study on combining scholarly information retrieval and scientometrics

Abstract

Models of science address statistical properties and mechanisms of science. From the perspective of scholarly information retrieval (IR), science models offer some potential to improve retrieval quality when operationalized as specific search strategies or used for ranking. From the science-modeling perspective, on the other hand, scholarly IR can serve as a validation model for science models. This paper studies the applicability and usefulness of two particular science models for re-ranking search results (Bradfordizing and author centrality). It presents a preliminary evaluation study that demonstrates the benefits of science-model-driven ranking techniques, but also shows how strongly the quality of search results can differ when different conceptualizations of science are used for ranking.
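The two re-ranking models named above lend themselves to a compact illustration. The sketch below is a simplified, hypothetical rendering of both ideas, not the authors' implementation: bradfordize orders a result set so that documents from journals occurring frequently in that set (the Bradford core) come first, and rank_by_author_centrality orders documents by the betweenness centrality of their most central author in the co-authorship graph built from the result set. The document fields "journal" and "authors" and the use of the networkx library are assumptions made for the example.

```python
# Minimal sketch of the two re-ranking ideas, assuming each retrieved document
# is a dict with hypothetical fields "journal" and "authors". Illustration only,
# not the implementation evaluated in the paper.
from collections import Counter
from itertools import combinations

import networkx as nx  # used only for betweenness centrality


def bradfordize(docs):
    """Re-rank so that documents from journals that occur frequently
    in the result set (the Bradford core) come first."""
    journal_freq = Counter(d["journal"] for d in docs)
    return sorted(docs, key=lambda d: journal_freq[d["journal"]], reverse=True)


def rank_by_author_centrality(docs):
    """Re-rank by the betweenness centrality of each document's most
    central author in the co-authorship graph of the result set."""
    graph = nx.Graph()
    for d in docs:
        graph.add_nodes_from(d["authors"])
        graph.add_edges_from(combinations(d["authors"], 2))
    centrality = nx.betweenness_centrality(graph)
    return sorted(
        docs,
        key=lambda d: max((centrality[a] for a in d["authors"]), default=0.0),
        reverse=True,
    )
```

Both functions impose a purely structural ordering on an already retrieved result set; this is what distinguishes such science-model rankings from the underlying text-based relevance score.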

Keywords

Information retrieval · Scientometrics · Bradfordizing · Network analysis · Betweenness · Science models

Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2014

Authors and Affiliations

GESIS – Leibniz Institute for the Social Sciences, Cologne, Germany