Bibliographic automatic classification algorithm based on semantic space transformation

Multimedia Tools and Applications

Abstract

An improved semantic space transformation method is proposed for data mining of Chinese bibliographic records. First, the ICTCLAS system is used to preprocess the texts, and term vectors are constructed from word frequency features. Next, term frequency and inverse document frequency features are fused to construct the feature matrix of the training sample set. Singular value decomposition is then applied to this matrix to obtain a semantic space, and the text feature vectors are transformed into this space to obtain semantic vectors. Finally, a joint SVM classifier is constructed to automatically classify the semantic vectors corresponding to the Chinese bibliographic records. Extensive experimental results show that the classification accuracy of this method is higher than that of existing methods.
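The feature construction and semantic space transformation described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the smoothed IDF formula, the power-iteration routine used in place of a full singular value decomposition, the toy English tokens (standing in for already segmented text), and all function names are assumptions made for illustration. The SVM classification step is not shown.

```python
import math
from collections import Counter


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def tfidf_matrix(docs):
    """Return (vocab, A) where A[i][j] is the TF-IDF weight of term i in document j."""
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    # Smoothed IDF: one common variant, assumed here for illustration.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    counts = [Counter(d) for d in docs]
    A = [[counts[j][t] / len(docs[j]) * idf[t] for j in range(n)] for t in vocab]
    return vocab, A


def semantic_basis(A, k, iters=200):
    """Approximate the top-k left singular vectors of A by power iteration
    on the Gram matrix A A^T, deflating against vectors already found."""
    m = len(A)
    G = [[dot(A[i], A[j]) for j in range(m)] for i in range(m)]
    basis, sigmas = [], []
    for c in range(k):
        v = [1.0 + 0.1 * ((i + c) % 7) for i in range(m)]  # deterministic start
        for _ in range(iters):
            w = [dot(row, v) for row in G]
            for u in basis:  # remove components along earlier vectors
                proj = dot(w, u)
                w = [a - proj * b for a, b in zip(w, u)]
            norm = math.sqrt(dot(w, w))
            if norm == 0.0:
                raise ValueError("rank of the feature matrix is below k")
            v = [a / norm for a in w]
        basis.append(v)
        sigmas.append(math.sqrt(dot(v, [dot(row, v) for row in G])))
    return basis, sigmas


def project(vec, basis):
    """Map a TF-IDF vector into the k-dimensional semantic space."""
    return [dot(u, vec) for u in basis]


def cosine(u, v):
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


# Toy corpus: two records about machine learning, two about history.
docs = [
    ["machine", "learning", "classification"],
    ["machine", "learning", "svm"],
    ["history", "china", "dynasty"],
    ["history", "china", "culture"],
]
vocab, A = tfidf_matrix(docs)
basis, sigmas = semantic_basis(A, k=2)
columns = [[row[j] for row in A] for j in range(len(docs))]
semantic = [project(col, basis) for col in columns]
```

In this sketch, records on the same topic end up with nearly parallel semantic vectors, while records on different topics end up nearly orthogonal. Vectors of this kind would then serve as the input to a classifier such as an SVM.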

Acknowledgments

This work was financially supported by the Teaching Reform Project of Higher Education of Zhejiang Province (No. jg20160374).

Author information

Corresponding author

Correspondence to Hua-Feng Yu.

Cite this article

Yu, HF. Bibliographic automatic classification algorithm based on semantic space transformation. Multimed Tools Appl 79, 9283–9297 (2020). https://doi.org/10.1007/s11042-019-7400-3

  • DOI: https://doi.org/10.1007/s11042-019-7400-3
