
An Efficient Topic-Specific Web Text Filtering Framework

  • Conference paper
Web Technologies Research and Development - APWeb 2005 (APWeb 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3399)


Abstract

This paper proposes an efficient topic-specific Web text filtering framework that blocks Web text content on specified topics. The framework combines a hybrid feature selection method with a highly efficient filtering engine. In training, we select features using the CHI statistic and rough set theory, and then construct the filter with the Vector Space Model. We train the framework on large datasets, and the results suggest that it is more effective for topic-specific text filtering. The framework runs on a server such as a gateway, and it is more efficient than a client-based system.
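
The training pipeline named in the abstract (CHI-based feature selection followed by a Vector Space Model filter) can be illustrated with a minimal sketch. This is not the authors' implementation: the rough set reduction step is omitted, and the function names, the toy corpus, the centroid-based filter, and the 0.5 threshold are all assumptions made for illustration.

```python
import math
from collections import Counter


def chi_square(term, docs, labels):
    """CHI statistic of `term` with respect to the on-topic class."""
    n = len(docs)
    a = b = c = d = 0
    for tokens, on_topic in zip(docs, labels):
        present = term in tokens
        if on_topic and present:
            a += 1  # on-topic documents containing the term
        elif not on_topic and present:
            b += 1  # off-topic documents containing the term
        elif on_topic:
            c += 1  # on-topic documents without the term
        else:
            d += 1  # off-topic documents without the term
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0


def select_features(docs, labels, k):
    """Keep the k terms with the highest CHI score (ties broken alphabetically)."""
    vocab = sorted({t for tokens in docs for t in tokens})
    ranked = sorted(vocab, key=lambda t: (-chi_square(t, docs, labels), t))
    return ranked[:k]


def vectorize(tokens, features):
    """Term-frequency vector over the selected features."""
    counts = Counter(tokens)
    return [counts[f] for f in features]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def build_centroid(docs, labels, features):
    """Average the vectors of the on-topic training documents (assumed design)."""
    vecs = [vectorize(t, features) for t, y in zip(docs, labels) if y]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def should_block(tokens, features, centroid, threshold=0.5):
    """Block a page whose similarity to the topic centroid reaches the threshold."""
    return cosine(vectorize(tokens, features), centroid) >= threshold


# Hypothetical toy corpus: True marks documents on the topic to be blocked.
docs = [
    ["casino", "bet", "win", "money"],
    ["casino", "poker", "bet"],
    ["weather", "rain", "sun"],
    ["news", "weather", "sport"],
]
labels = [True, True, False, False]

features = select_features(docs, labels, k=3)
centroid = build_centroid(docs, labels, features)

print(features)
print(should_block(["bet", "casino", "tonight"], features, centroid))
print(should_block(["weather", "rain"], features, centroid))
```

Note that the CHI statistic is symmetric, so a term that strongly indicates off-topic text can also rank highly; in this sketch such a term simply receives zero weight in the on-topic centroid.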




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, Q., Li, J. (2005). An Efficient Topic-Specific Web Text Filtering Framework. In: Zhang, Y., Tanaka, K., Yu, J.X., Wang, S., Li, M. (eds) Web Technologies Research and Development - APWeb 2005. APWeb 2005. Lecture Notes in Computer Science, vol 3399. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31849-1_16


  • DOI: https://doi.org/10.1007/978-3-540-31849-1_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25207-8

  • Online ISBN: 978-3-540-31849-1

  • eBook Packages: Computer Science, Computer Science (R0)
