Important citation identification by exploiting the syntactic and contextual information of citations

Scientometrics

Abstract

Citations are not equally important. Researchers have presented various models and techniques for identifying important citations. However, the features used in these studies are relatively limited, which restricts their recognition performance. This paper proposes a new machine learning framework that distinguishes important from non-important citations by examining the syntactic and contextual information of citations. Syntactic features reflect the statistical characteristics of citation behavior, such as the frequency and position with which the cited article is cited in the citing articles. Contextual features reflect the semantic content of citations, such as their intent and polarity. Three feature selection algorithms (the Pearson correlation coefficient, Relief-F, and the entropy weight method) were used to calculate the contribution of each index to distinguishing different kinds of citations. On this basis, the key features that best identify important citations were screened out. Three classifiers (support vector machine, k-nearest neighbors (KNN), and random forest) were used to test the classification performance of these key features. The experiment was performed on two annotated benchmark datasets. The results show that the proposed framework achieves better classification performance than contemporary state-of-the-art research. The syntactic and contextual features of citations are of great value in identifying important citations.
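
As an illustration of one of the feature selection steps named above, the following is a minimal sketch of the entropy weight method in its common textbook form. It is not taken from the paper, whose exact normalization and indices are not given in this abstract. The function name `entropy_weights` and the toy data are hypothetical, and the sketch assumes non-negative feature values and at least two samples.

```python
import math


def entropy_weights(rows):
    """Return one weight per column (textbook entropy weight method).

    rows: list of equal-length lists of non-negative numbers,
    laid out as samples x indicators. Assumes at least two rows.
    """
    n = len(rows)
    m = len(rows[0])
    divergences = []
    for j in range(m):
        column = [row[j] for row in rows]
        total = sum(column)
        if total == 0:
            # An all-zero column carries no information.
            divergences.append(0.0)
            continue
        # Share of each sample in the column total.
        shares = [value / total for value in column]
        # Normalised Shannon entropy in [0, 1]; 0 * ln(0) is treated as 0.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        # Clamp to guard against tiny negative values from rounding.
        divergences.append(max(0.0, 1.0 - entropy))
    spread = sum(divergences)
    if spread == 0:
        # Every column is uniform: fall back to equal weights.
        return [1.0 / m] * m
    return [d / spread for d in divergences]


# Hypothetical toy data: column 0 varies across samples, column 1 is constant.
toy = [[1, 1], [1, 1], [8, 1], [2, 1]]
weights = entropy_weights(toy)
```

In this sketch a column whose values vary strongly across samples has low entropy and therefore receives a large weight, while a constant column receives a weight of (approximately) zero. Ranking indices by such weights is one way to shortlist candidate key features before passing them to a classifier.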

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 71473034; 717D1063), the Heilongjiang Provincial Natural Science Foundation of China (Grant No. LH2019G001), the Postdoctoral Scientific Research Developmental Fund of Heilongjiang Province (Grant No. LBH-Q16003), and the Heilongjiang Province Art Planning Project "Research on Discipline Theme Evolution Based on Multi-source Data Fusion" (Grant No. 2019C027).

Author information

Corresponding author

Correspondence to Guangsheng Chen.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 17 kb)

About this article

Cite this article

Wang, M., Zhang, J., Jiao, S. et al. Important citation identification by exploiting the syntactic and contextual information of citations. Scientometrics 125, 2109–2129 (2020). https://doi.org/10.1007/s11192-020-03677-1

  • DOI: https://doi.org/10.1007/s11192-020-03677-1
