A Method for Topic Classification of Web Pages Using LDA-SVM Model

  • Conference paper
Proceedings of 2017 Chinese Intelligent Automation Conference (CIAC 2017)

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 458)

Abstract

Rapid advances in computer and networking technologies have made the Internet the largest information medium in the world. Many companies want timely and effective access to information from the Internet, which calls for an efficient web page classification system. To meet these classification requirements, we use an LDA-SVM model for fine-grained web page category classification. We also discuss the impact of the number of topics K in LDA on classification. Experiments show that our method is efficient.
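The general idea of an LDA-SVM classifier can be illustrated with a minimal sketch: LDA reduces each page to a K-dimensional topic distribution, and an SVM is trained on those distributions. The example below is an assumption-laden illustration, not the authors' implementation. It uses scikit-learn (the abstract does not say which libraries were used), a tiny made-up English corpus, a linear kernel, and hypothetical helper names (`build_lda_svm`, `topic_features`, `accuracy_by_k`).

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Toy corpus (made up for illustration): two page categories.
docs = [
    "football match goal team league",
    "team coach player score goal",
    "league season player football win",
    "match score win coach season",
    "stock market shares bank profit",
    "bank interest loan market investor",
    "investor shares profit stock fund",
    "fund loan interest bank stock",
]
labels = ["sports"] * 4 + ["finance"] * 4


def build_lda_svm(n_topics):
    """Bag-of-words counts -> LDA topic proportions (K = n_topics) -> SVM."""
    return Pipeline([
        ("bow", CountVectorizer()),
        ("lda", LatentDirichletAllocation(n_components=n_topics, random_state=0)),
        ("svm", SVC(kernel="linear")),  # kernel choice is an assumption
    ])


def topic_features(model, texts):
    """Return the document-topic matrix that the SVM stage sees."""
    counts = model.named_steps["bow"].transform(texts)
    return model.named_steps["lda"].transform(counts)


def accuracy_by_k(ks):
    """Fit one pipeline per topic number K and record its training accuracy."""
    results = {}
    for k in ks:
        model = build_lda_svm(k).fit(docs, labels)
        results[k] = model.score(docs, labels)
    return results


if __name__ == "__main__":
    for k, acc in accuracy_by_k([2, 4, 8]).items():
        print(f"K={k}: training accuracy {acc:.2f}")
```

The loop over K mirrors the kind of comparison the abstract mentions. A real study would score each K on held-out pages rather than on the training set, and Chinese web pages would first need word segmentation before the bag-of-words step.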

Acknowledgements

Project supported by the National Natural Science Foundation of China under grant No. 61371177.

Author information

Corresponding author

Correspondence to Bailing Wang.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Wei, Y., Wang, W., Wang, B., Yang, B., Liu, Y. (2018). A Method for Topic Classification of Web Pages Using LDA-SVM Model. In: Deng, Z. (eds) Proceedings of 2017 Chinese Intelligent Automation Conference. CIAC 2017. Lecture Notes in Electrical Engineering, vol 458. Springer, Singapore. https://doi.org/10.1007/978-981-10-6445-6_64

  • DOI: https://doi.org/10.1007/978-981-10-6445-6_64

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-6444-9

  • Online ISBN: 978-981-10-6445-6

  • eBook Packages: Engineering, Engineering (R0)
