Finalizing your reference list with machine learning

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

A high-quality reference list is important to the overall quality of a research paper. However, generating a reference list with good coverage, representativeness, and timeliness requires domain knowledge and is time-consuming because of the large and rapidly growing volume of publications. In this paper, we address the specific problem of enhancing the reference list of a research manuscript with machine learning. A predictive model is trained on a large academic dataset with paper-related and venue-related information to discover additional references for a scientific draft, given related information that includes an initial reference list. We propose a supervised approach called RefCom, under the learning-to-rank framework, to predict the probability that a given paper cites a reference candidate. In total, forty features are defined to describe pairs of papers with respect to author influence, venue influence, and paper influence, as well as content and reference similarity. Unlike heuristic rule-based approaches, RefCom is able to integrate multiple features with learned weights. An experimental study on the AMiner dataset, which contains 2 million papers and 1.7 million authors, shows the effectiveness of RefCom in citation prediction, suggesting its potential as an assistant tool for reference finalization.
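To make the idea of scoring (draft, candidate) pairs with learned feature weights more concrete, the following is a minimal pointwise sketch in Python. It is not the RefCom implementation: the three features, the field names (`words`, `refs`, `influence`), the logistic model, and the toy data are all assumptions chosen for illustration, whereas the paper defines forty features and does not specify its model here.

```python
import math


def jaccard(a, b):
    """Jaccard overlap between two collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_features(draft, candidate):
    """Hypothetical 3-feature description of a (draft, candidate) pair."""
    return [
        jaccard(draft["words"], candidate["words"]),  # content similarity
        jaccard(draft["refs"], candidate["refs"]),    # reference similarity
        candidate["influence"],  # assumed pre-scaled to [0, 1]
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b):
    """Estimated probability that the draft cites the candidate."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on log loss; returns (weights, bias)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict(xi, w, b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def rank_candidates(draft, candidates, w, b):
    """Return (probability, candidate id) pairs, most likely citation first."""
    scored = [(predict(pair_features(draft, c), w, b), c["id"]) for c in candidates]
    return sorted(scored, reverse=True)


# Toy labelled pairs (made up): feature vectors and whether a citation exists.
train_X = [[0.8, 0.6, 0.5], [0.7, 0.5, 0.2], [0.0, 0.0, 0.6], [0.0, 0.0, 0.3]]
train_y = [1, 1, 0, 0]
weights, bias = train_logistic(train_X, train_y)

draft = {"words": {"citation", "ranking", "learning"}, "refs": {"p1", "p2", "p3"}}
candidates = [
    {"id": "A", "words": {"citation", "ranking", "graph"}, "refs": {"p1", "p2"}, "influence": 0.4},
    {"id": "B", "words": {"protein", "folding"}, "refs": {"p9"}, "influence": 0.4},
]
for prob, cid in rank_candidates(draft, candidates, weights, bias):
    print(f"{cid}: {prob:.3f}")
```

The contrast with a heuristic rule is that the relative importance of each feature is fitted from labelled citation pairs rather than fixed by hand. A listwise or pairwise ranking loss could replace the pointwise log loss used in this sketch.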

Notes

  1. https://www.microsoft.com/en-us/research/project/microsoft-academic-graph/.

  2. https://aminer.org/.

Acknowledgements

This project was supported by the National Natural Science Foundation of China (Grant no. 61502420) and the Natural Science Foundation of Zhejiang Province (Grant no. LY16F020032).

Author information

Corresponding author

Correspondence to Jian-Ping Mei.

Appendix

Table 2 gives the complete list of features used in RefCom.

Table 2 Description of each feature

About this article

Cite this article

Mei, JP., Chen, D., Fan, J. et al. Finalizing your reference list with machine learning. J Ambient Intell Human Comput 14, 14883–14892 (2023). https://doi.org/10.1007/s12652-018-0976-z

  • DOI: https://doi.org/10.1007/s12652-018-0976-z
