Detecting impact factor manipulation with data mining techniques
Deliberate manipulation of the impact factor seriously undermines its fairness as a metric, and such behavior should be prevented by effective means. This paper applies data mining techniques to the problem. First, a feature set of ten features is collected for nine normal journals and nine abnormal journals from 2005 to 2014. Then three families of strong classification methods, k-nearest neighbor, decision tree and support vector machine, are used to learn classification models. In total, eight algorithms are run on the data set to find methods suitable for detecting impact factor manipulation. Finally, the two best-performing algorithms, with precisions higher than 85%, are selected and used to predict new journal samples. The results indicate that random forest and one type of support vector machine are more suitable than k-nearest neighbor for detecting abnormal journals. Applied to 90 other journals across nine disciplines from 2007 to 2014, the two methods prove broadly applicable, and four journals are identified as manipulated in some years. The two data mining methods thus offer journal managers an intelligent and automatic way to detect and prevent impact factor manipulation.
Keywords: Impact factor, Manipulation, Data mining, Classification, Prediction
The authors would like to thank the editor and anonymous referees for their constructive comments that substantially helped improve the quality and presentation of this paper. This work was supported by the National Natural Science Foundation of China (Grant Nos. 71501040, 71473034), and the Fundamental Research Funds for the Central Universities (2242014K10020).