Abstract
The web is a rich resource for information discovery, and as a result web mining is a hot topic. However, a reliable mining result depends on the reliability of the data set. Every second, the web generates a huge amount of data, such as web page requests and file transfers. These data reflect human behavior in cyberspace and are therefore valuable for analysis in various disciplines, e.g. social science and network security. How to store the data is a challenge. A usual strategy is to save a summary of the data, for example by using aggregation functions to preserve the features of the original data in much less space. A key problem, however, is that such information can be distorted by the presence of illegitimate traffic, e.g. botnet recruitment scanning or DDoS attack traffic. An important consideration in web-related knowledge discovery is therefore the robustness of the aggregation method, which in turn may be affected by the reliability of the network traffic data. In this chapter, we first present the methods of aggregation functions, and then we employ information distances to filter out anomalous data as a preparation for web data mining.
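The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: the abstract does not say which aggregation function or which information distance is used, so the ordered weighted average (OWA) and the symmetric Kullback-Leibler (Jeffrey) distance below are assumptions chosen for illustration, as are the function names, the smoothing constant `eps`, and the threshold.

```python
import math
from collections import Counter


def distribution(items):
    """Empirical probability distribution of the values in `items`."""
    counts = Counter(items)
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


def jeffrey_distance(p, q, eps=1e-9):
    """Symmetric Kullback-Leibler (Jeffrey) distance between two distributions.

    `eps` is an assumed smoothing constant that avoids log(0) when a value
    appears in only one of the two distributions.
    """
    distance = 0.0
    for key in set(p) | set(q):
        pk = p.get(key, 0.0) + eps
        qk = q.get(key, 0.0) + eps
        distance += (pk - qk) * math.log(pk / qk)
    return distance


def owa(values, weights):
    """Ordered weighted average: weights apply to the values sorted descending."""
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def filter_windows(windows, baseline, threshold):
    """Keep only windows whose distribution stays close to the baseline."""
    base = distribution(baseline)
    return [
        w for w in windows
        if jeffrey_distance(distribution(w), base) <= threshold
    ]


# Hypothetical data: each window is a list of requested page identifiers.
baseline = ["a", "b", "c", "d"] * 25
normal_window = ["a", "b", "c", "d"] * 5
attack_window = ["a"] * 20  # traffic concentrated on a single page

kept = filter_windows([normal_window, attack_window], baseline, threshold=0.5)
# Aggregate the request counts of the surviving windows into one summary value.
summary = owa([len(w) for w in kept], weights=[1.0])
```

In this sketch the anomalous window is dropped before aggregation, so the stored summary reflects only traffic whose distribution resembles the baseline. How the baseline and threshold should be chosen is left open here.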
Copyright information
© 2012 Springer Science+Business Media, LLC
About this paper
Cite this paper
Yu, S., James, S., Tian, Y., Dou, W. (2012). Reliable Aggregation on Network Traffic for Web Based Knowledge Discovery. In: Dai, H., Liu, J., Smirnov, E. (eds) Reliable Knowledge Discovery. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-1903-7_8
DOI: https://doi.org/10.1007/978-1-4614-1903-7_8
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4614-1902-0
Online ISBN: 978-1-4614-1903-7
eBook Packages: Computer Science, Computer Science (R0)