Abstract
An improved semantic space transformation method is proposed for data mining of Chinese bibliographic records. First, the ICTCLAS system is used to preprocess the texts and construct term vectors based on word-frequency features. Next, term frequency and inverse document frequency features are fused to construct the feature matrix of the training sample set. This matrix is then decomposed by singular value decomposition to obtain a semantic space, and the text feature vectors are transformed into this space to obtain semantic vectors. Finally, a joint SVM classifier is constructed to automatically classify the semantic vectors corresponding to the Chinese bibliographic records. Extensive experimental results show that the classification accuracy of this method is higher than that of existing methods.
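The feature-construction and transformation steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the author's implementation. It assumes the texts have already been segmented into tokens (the English tokens below are placeholders for segmenter output), uses a smoothed IDF formula chosen for illustration, and applies the standard latent-semantic fold-in to map a term vector into the reduced space. All function names and the choice k = 2 are hypothetical. The classification step is omitted; the resulting semantic vectors would be passed to an SVM classifier.

```python
import numpy as np

def term_counts(doc, index):
    """Raw term-frequency vector for one tokenized document."""
    x = np.zeros(len(index))
    for tok in doc:
        if tok in index:
            x[index[tok]] += 1.0
    return x

def build_tfidf(docs, index):
    """Term-by-document matrix weighted by TF * IDF, plus the IDF weights."""
    tf = np.column_stack([term_counts(d, index) for d in docs])
    df = np.count_nonzero(tf, axis=1)               # documents containing each term
    idf = np.log((1 + len(docs)) / (1 + df)) + 1.0  # smoothed IDF (an assumption)
    return tf * idf[:, None], idf

def semantic_space(A, k):
    """Truncated SVD: keep the k leading left singular vectors and values."""
    U, s, _ = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k]

def to_semantic(x, U_k, s_k):
    """Fold a weighted term vector into the k-dimensional semantic space."""
    return (U_k.T @ x) / s_k

# Placeholder tokens standing in for segmented bibliographic records.
docs = [
    ["semantic", "space", "transformation"],
    ["semantic", "vector", "classification"],
    ["library", "classification", "catalog"],
    ["library", "catalog", "retrieval"],
]
vocab = sorted({t for d in docs for t in d})
index = {t: i for i, t in enumerate(vocab)}

A, idf = build_tfidf(docs, index)
U_k, s_k = semantic_space(A, k=2)

# Semantic vectors for the training documents (one row per document).
Z_train = np.array([to_semantic(A[:, j], U_k, s_k) for j in range(A.shape[1])])

# Semantic vector for an unseen document, weighted with the training IDF.
new_doc = ["semantic", "classification"]
z_new = to_semantic(term_counts(new_doc, index) * idf, U_k, s_k)
```

Reusing the training IDF weights and the training singular vectors for the unseen document keeps it in the same space as `Z_train`, which is what a downstream classifier would require.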
Acknowledgments
This work was financially supported by the Teaching Reform Project of Higher Education of Zhejiang Province (No. jg20160374).
Cite this article
Yu, HF. Bibliographic automatic classification algorithm based on semantic space transformation. Multimed Tools Appl 79, 9283–9297 (2020). https://doi.org/10.1007/s11042-019-7400-3
DOI: https://doi.org/10.1007/s11042-019-7400-3