Reliable Aggregation on Network Traffic for Web Based Knowledge Discovery

  • Conference paper

Abstract

The web is a rich resource for information discovery, and as a result web mining is a hot topic. However, a reliable mining result depends on the reliability of the data set. Every second, the web generates a huge amount of data, such as web page requests and file transfers. These data reflect human behavior in cyberspace and are therefore valuable for analysis in various disciplines, e.g. social science and network security. How to store the data is a challenge. A usual strategy is to save an abstract of the data, for example by using aggregation functions to preserve the features of the original data in much smaller space. A key problem, however, is that such information can be distorted by the presence of illegitimate traffic, e.g. botnet recruitment scanning and DDoS attack traffic. An important consideration in web-related knowledge discovery, then, is the robustness of the aggregation method, which in turn may be affected by the reliability of the network traffic data. In this chapter, we first present the methods of aggregation functions, and then we employ information distances to filter out anomalous data as a preparation for web data mining.
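The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's method: the choice of an ordered weighted average (OWA) as the aggregation function, the symmetric Kullback-Leibler divergence as the information distance, the smoothing constant, the threshold, and every function name below are assumptions made purely for illustration.

```python
import math
from collections import Counter


def owa(values, weights):
    """Ordered weighted average: weights apply to values sorted in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def distribution(items, support, eps=1e-6):
    """Empirical distribution of items over a fixed support, smoothed by eps (assumed)."""
    counts = Counter(items)
    total = len(items) + eps * len(support)
    return [(counts[s] + eps) / total for s in support]


def symmetric_kl(p, q):
    """Symmetric KL divergence D(p||q) + D(q||p), one possible information distance."""
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))


def filter_windows(windows, baseline, support, threshold):
    """Keep only the traffic windows whose distance to the baseline is within threshold."""
    base = distribution(baseline, support)
    return [
        w for w in windows
        if symmetric_kl(distribution(w, support), base) <= threshold
    ]


# Hypothetical data: requested pages in a trusted baseline and in two observed windows.
pages = ["a", "b", "c", "d"]
baseline = pages * 25
normal_window = pages * 5
flood_window = ["a"] * 20  # every request hits one page, as a flood might

kept = filter_windows([normal_window, flood_window], baseline, pages, threshold=0.5)
```

In this toy setting only `normal_window` survives the filter, so a later aggregation step such as `owa` would summarize traffic that resembles the baseline rather than the flood.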

Author information

Corresponding author

Correspondence to Shui Yu.

Copyright information

© 2012 Springer Science+Business Media, LLC

About this paper

Cite this paper

Yu, S., James, S., Tian, Y., Dou, W. (2012). Reliable Aggregation on Network Traffic for Web Based Knowledge Discovery. In: Dai, H., Liu, J., Smirnov, E. (eds) Reliable Knowledge Discovery. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-1903-7_8

  • DOI: https://doi.org/10.1007/978-1-4614-1903-7_8

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4614-1902-0

  • Online ISBN: 978-1-4614-1903-7

  • eBook Packages: Computer Science (R0)
