Reliable Aggregation on Network Traffic for Web Based Knowledge Discovery

Yu, Shui; James, Simon; Tian, Yonghong; Dou, Wanchun

doi:10.1007/978-1-4614-1903-7_8


Conference paper
First Online: 01 January 2012


Abstract

The web is a rich resource for information discovery, and as a result web mining is a hot topic. However, a reliable mining result depends on the reliability of the data set. Every second, the web generates a huge amount of data, such as web page requests and file transfers. These data reflect human behavior in cyberspace and are therefore valuable for analysis in various disciplines, e.g. social science and network security. How to store the data is a challenge. A usual strategy is to save a summary of the data, for example by using aggregation functions to preserve the features of the original data in much less space. A key problem, however, is that such information can be distorted by the presence of illegitimate traffic, e.g. botnet recruitment scanning and DDoS attack traffic. An important consideration in web-related knowledge discovery, then, is the robustness of the aggregation method, which in turn may be affected by the reliability of the network traffic data. In this chapter, we first present the methods of aggregation functions, and then we employ information distances to filter out anomalous data as a preparation for web data mining.
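The two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's actual method, which is not visible in this preview. It assumes that traffic is summarized as per-window request counts per source, that the information distance is the symmetric Kullback-Leibler (Jeffrey) distance, and that the aggregation function is an ordered weighted average (OWA). The names `owa`, `distribution`, `jeffrey_distance`, the sample counts, and the value of `THRESHOLD` are all hypothetical.

```python
import math
from collections import Counter


def owa(values, weights):
    """Ordered weighted average: weights apply to values sorted in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def distribution(counts, keys, eps=1e-9):
    """Turn raw counts into a probability vector over keys, smoothed to avoid zeros."""
    total = sum(counts.get(k, 0) for k in keys) + eps * len(keys)
    return [(counts.get(k, 0) + eps) / total for k in keys]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))


def jeffrey_distance(p, q):
    """Symmetric information distance: D(p || q) + D(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


# Hypothetical data: requests per source in a baseline and in three time windows.
baseline = Counter({"a": 50, "b": 30, "c": 20})
windows = [
    Counter({"a": 48, "b": 33, "c": 19}),
    Counter({"a": 52, "b": 28, "c": 20}),
    Counter({"a": 5, "b": 5, "c": 390}),  # made-up flood dominated by one source
]

keys = sorted(baseline)
base_p = distribution(baseline, keys)
THRESHOLD = 0.5  # assumed cut-off in bits; would need tuning on real traffic

# Step 1: drop windows whose source distribution is far from the baseline.
kept = [
    w for w in windows
    if jeffrey_distance(distribution(w, keys), base_p) <= THRESHOLD
]

# Step 2: summarize the remaining windows with an OWA of their total volumes.
totals = [sum(w.values()) for w in kept]
summary = owa(totals, [1.0 / len(totals)] * len(totals))
print(len(kept), summary)
```

With equal weights the OWA reduces to the arithmetic mean; putting all weight on the middle position gives the median, which is one way an aggregation function can be made less sensitive to extreme values.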



Author information

Authors and Affiliations

School of Information Technology, Deakin University, Victoria, Australia
Shui Yu & Simon James
School of Electronic Engineering and Computer Science, Peking University, Beijing, China
Yonghong Tian
Department of Computer Science and Technology, Nanjing University, Nanjing, China
Wanchun Dou


Corresponding author

Correspondence to Shui Yu.

Editor information

Editors and Affiliations

School of Information Technology, Deakin University, 221 Burwood Highway, Burwood, 3125, Victoria, Australia
Honghua Dai
Computing, Hong Kong Polytechnic University, Man Wai Building, Hunghom, PQ806, Hong Kong SAR
James N. K. Liu
Department of Knowledge Engineering, Maastricht University, Maastricht, 6200MD, Netherlands
Evgueni Smirnov


About this paper

Cite this paper

Yu, S., James, S., Tian, Y., Dou, W. (2012). Reliable Aggregation on Network Traffic for Web Based Knowledge Discovery. In: Dai, H., Liu, J., Smirnov, E. (eds) Reliable Knowledge Discovery. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-1903-7_8


DOI: https://doi.org/10.1007/978-1-4614-1903-7_8
Published: 08 February 2012
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4614-1902-0
Online ISBN: 978-1-4614-1903-7
eBook Packages: Computer Science, Computer Science (R0)
