
Modeling uncertainty in bibliometrics and information retrieval: an information fusion approach

Scientometrics

Abstract

We describe ongoing research that aims to apply recent results from the field of information fusion to bibliometric analysis and information retrieval. We highlight the importance of ‘uncertainty’ within information fusion and argue that this concept is also crucial for bibliometrics and information retrieval. More specifically, we elaborate on three research strategies related to uncertainty: uncertainty management methods, explanation of uncertainty, and visualization of uncertainty. We illustrate these strategies on the classical problem of author name disambiguation, showing how uncertainty can be modeled, explained, and visualized using information fusion. We show how an information seeker can benefit from tracing increases and decreases of uncertainty in the reasoning process. We also present how such changes can be explained to the information seeker through visualization techniques, which are employed to highlight the complexity involved in modeling and managing uncertainty in bibliometric analysis. Finally, we argue that further integration of information fusion approaches into bibliometrics and information retrieval may result in new and fruitful avenues of research.
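As a rough illustration of what tracing increases and decreases of uncertainty could look like for author name disambiguation, the following minimal sketch fuses three pieces of evidence about whether two records refer to the same author. It is not the method of the article: it assumes Dempster's rule of combination over a two-element frame and uses the entropy of the pignistic probability as the uncertainty measure. The evidence names and all mass values are hypothetical.

```python
from itertools import product
from math import log2

SAME, DIFF = "same", "different"
THETA = frozenset({SAME, DIFF})  # frame of discernment: same author or not


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions (dict: frozenset -> mass)."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass that would fall on the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


def pignistic_same(m):
    """Pignistic probability of 'same': mass on the whole frame is split evenly."""
    return m.get(frozenset({SAME}), 0.0) + m.get(THETA, 0.0) / 2.0


def entropy(p):
    """Binary Shannon entropy in bits, used here as the uncertainty measure."""
    return 0.0 if p in (0.0, 1.0) else -(p * log2(p) + (1 - p) * log2(1 - p))


def trace(evidence):
    """Fuse evidence step by step and record the uncertainty after each step."""
    state = {THETA: 1.0}  # start from total ignorance
    history = [("prior", entropy(pignistic_same(state)))]
    for name, mass in evidence:
        state = combine(state, mass)
        history.append((name, entropy(pignistic_same(state))))
    return state, history


# Hypothetical evidence; the mass values are made up for illustration only.
evidence = [
    ("shared co-author", {frozenset({SAME}): 0.6, THETA: 0.4}),
    ("different affiliation", {frozenset({DIFF}): 0.3, THETA: 0.7}),
    ("matching e-mail", {frozenset({SAME}): 0.8, THETA: 0.2}),
]

final_state, history = trace(evidence)
for step, uncertainty in history:
    print(f"{step:22s} uncertainty = {uncertainty:.3f} bits")
```

With these made-up values the recorded uncertainty falls after the co-author evidence, rises when the conflicting affiliation evidence arrives, and falls again after the e-mail match. A per-step history of this kind is what a visualization could then present to the information seeker.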



Acknowledgments

As part of the Information Fusion as an E-Service in Scholarly Information Use (INCITE) project, this research has been supported by Region Västra Götaland under grant RUN 612-0198-13, University of Borås and University of Skövde.

Author information

Corresponding author

Correspondence to Björn Hammarfelt.


Cite this article

Karlsson, A., Hammarfelt, B., Steinhauer, H.J. et al. Modeling uncertainty in bibliometrics and information retrieval: an information fusion approach. Scientometrics 102, 2255–2274 (2015). https://doi.org/10.1007/s11192-014-1481-6
