Abstract
The rapid development of computer and networking technologies has made the Internet the largest information medium in the world. Many companies want timely and effective access to information from the Internet, which calls for an efficient web page classification system. To meet these classification requirements, we use an LDA-SVM model for fine-grained classification of web pages by category, and we discuss the impact of the number of topics K in LDA on classification performance. Experiments show that our method is efficient.
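As a rough illustration of the LDA-SVM idea summarized above, the sketch below turns each document into a K-dimensional topic distribution with LDA and trains an SVM on those vectors, then repeats this for two values of K. It is a minimal sketch under stated assumptions, not the authors' implementation: scikit-learn is assumed as the toolkit, and the toy corpus, the labels, and the helper names `build_lda_svm` and `topic_features` are hypothetical.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical toy corpus; real web pages would first be stripped of
# markup and tokenized before this step.
docs = [
    "stock market shares investors trading profit",
    "bank interest rates loan market investors",
    "shares profit bank trading stock loan",
    "football match goal team league player",
    "team player coach season match score",
    "league goal score football coach season",
]
labels = ["finance", "finance", "finance", "sports", "sports", "sports"]


def build_lda_svm(n_topics, seed=0):
    """Word counts -> LDA topic proportions (K = n_topics) -> linear SVM."""
    return make_pipeline(
        CountVectorizer(),
        LatentDirichletAllocation(n_components=n_topics, random_state=seed),
        SVC(kernel="linear"),
    )


def topic_features(model, texts):
    """Return the K-dimensional topic distribution the SVM sees for each text."""
    counts = model.named_steps["countvectorizer"].transform(texts)
    return model.named_steps["latentdirichletallocation"].transform(counts)


# Vary the topic number K and inspect the resulting feature size and fit.
for k in (2, 4):
    model = build_lda_svm(k).fit(docs, labels)
    theta = topic_features(model, docs)
    print(k, theta.shape, model.score(docs, labels))
```

The accuracy printed here is measured on the training documents and only shows that the pipeline runs; studying the effect of K in earnest would need a held-out test set and a much larger corpus.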
Acknowledgements
This project was supported by the National Natural Science Foundation of China under grant No. 61371177.
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wei, Y., Wang, W., Wang, B., Yang, B., Liu, Y. (2018). A Method for Topic Classification of Web Pages Using LDA-SVM Model. In: Deng, Z. (eds) Proceedings of 2017 Chinese Intelligent Automation Conference. CIAC 2017. Lecture Notes in Electrical Engineering, vol 458. Springer, Singapore. https://doi.org/10.1007/978-981-10-6445-6_64
DOI: https://doi.org/10.1007/978-981-10-6445-6_64
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6444-9
Online ISBN: 978-981-10-6445-6
eBook Packages: Engineering, Engineering (R0)