Predicting Future Links Between Disjoint Research Areas Using Heterogeneous Bibliographic Information Network

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9078)

Abstract

Literature-based discovery aims to uncover hidden connections between previously disconnected research areas. A heterogeneous bibliographic information network (HBIN) provides a latent, semi-structured bibliographic information model that signals potential connections between scientific papers. This paper introduces a novel literature-based discovery method that builds meta path features from an HBIN to predict co-citation links between previously disconnected literatures. We evaluated the method on predicting future co-citation links between fish oil and Raynaud’s syndrome papers. The experimental results showed that HBIN meta path features predicted future co-citation links between these papers with high accuracy (0.851 F-measure; 0.845 precision; 0.857 recall), outperforming existing document similarity algorithms such as LDA, TF-IDF, and bibliographic coupling.
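
As a rough illustration of what a meta path feature can look like, the sketch below counts paper-X-paper path instances between two papers in a toy network, where X is a shared author, term, or venue. The abstract does not list the node types, meta paths, or classifier actually used, so the paper IDs, node types, and function names here are all hypothetical. The second function checks that the reported precision and recall are consistent with the reported F-measure, assuming the balanced (F1) form.

```python
# Toy HBIN: every paper maps to its typed neighbours.
# All paper IDs, node types, and values below are hypothetical.
TOY_HBIN = {
    "fish_oil_paper": {
        "author": {"Author A"},
        "term": {"blood viscosity", "platelet aggregation"},
        "venue": {"Journal X"},
    },
    "raynaud_paper": {
        "author": {"Author B"},
        "term": {"blood viscosity", "vasoconstriction"},
        "venue": {"Journal X"},
    },
}


def meta_path_features(paper_a, paper_b, network):
    """Count paper-<type>-paper path instances for each neighbour type."""
    features = {}
    for node_type, neighbours in network[paper_a].items():
        # A path instance exists for every neighbour both papers share.
        shared = neighbours & network[paper_b].get(node_type, set())
        features[f"paper-{node_type}-paper"] = len(shared)
    return features


def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall (assumed F1)."""
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(meta_path_features("fish_oil_paper", "raynaud_paper", TOY_HBIN))
    print(round(f_measure(0.845, 0.857), 3))
```

In a setup like this, the per-pair counts would form the feature vector passed to a supervised classifier that labels each cross-literature paper pair as a future co-citation link or not. With the reported precision and recall, the balanced F-measure rounds to the 0.851 stated in the abstract.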

Author information

Corresponding author

Correspondence to Yakub Sebastian.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Sebastian, Y., Siew, EG., Orimaye, S.O. (2015). Predicting Future Links Between Disjoint Research Areas Using Heterogeneous Bibliographic Information Network. In: Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D., Motoda, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2015. Lecture Notes in Computer Science, vol 9078. Springer, Cham. https://doi.org/10.1007/978-3-319-18032-8_48

  • DOI: https://doi.org/10.1007/978-3-319-18032-8_48

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18031-1

  • Online ISBN: 978-3-319-18032-8

  • eBook Packages: Computer Science, Computer Science (R0)
