Abstract
Packet analysis plays an important role in modern digital communication. Protocol analyzers, however, are limited because they can process only data in a known, predetermined format. This paper proposes a method for decoding raw data in an unknown format. Such data can generally be segmented into packets because packet headers usually contain characteristic bit sequences, so the key problem is how to find those sequences. We present an efficient method of bit sequence enumeration that uses both the Aho-Corasick (AC) algorithm and a data mining method to reduce the cost of the process.
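The idea of enumerating bit sequences while pruning unpromising branches can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a simple Apriori-style rule (a bit sequence can be frequent only if its prefix one bit shorter is frequent) and counts overlapping occurrences in a single bit stream. The function names `to_bits` and `frequent_bit_sequences`, the parameters `max_len` and `min_count`, and the sample data are all hypothetical.

```python
from collections import Counter


def to_bits(data: bytes) -> str:
    """Render bytes as a string of '0'/'1' characters, most significant bit first."""
    return "".join(f"{byte:08b}" for byte in data)


def frequent_bit_sequences(bits: str, max_len: int, min_count: int) -> dict:
    """Return every bit sequence of length 1..max_len seen at least min_count times.

    Assumption (not from the paper): level-wise enumeration with Apriori-style
    pruning. A window is counted only if its prefix survived the previous
    level, so infrequent branches of the enumeration tree are never expanded.
    """
    frequent = {}
    survivors = {""}  # root of the enumeration tree: the empty sequence
    for length in range(1, max_len + 1):
        counts = Counter()
        for start in range(len(bits) - length + 1):
            window = bits[start:start + length]
            if window[:-1] in survivors:  # prune: prefix must be frequent
                counts[window] += 1
        survivors = {seq for seq, n in counts.items() if n >= min_count}
        if not survivors:
            break  # no branch left to extend
        for seq in survivors:
            frequent[seq] = counts[seq]
    return frequent


# Hypothetical stream: three "packets", each starting with the byte 0x7E.
sample = to_bits(b"\x7e\x01\x7e\x02\x7e\x03")
candidates = frequent_bit_sequences(sample, max_len=8, min_count=3)
longest = max(candidates, key=len)
```

In this toy stream the repeated header byte shows up among the longest surviving candidates. A real stream would need a carefully chosen threshold, and the surviving candidates could then be handed to a multi-pattern matcher such as Aho-Corasick to locate packet boundaries.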
References
Hume A, Sunday D. Fast string searching [J]. Software: Practice and Experience, 1991, 21(11): 1221–1248.
Knuth D E, Morris J H, Pratt V R. Fast pattern matching in strings [J]. SIAM Journal on Computing, 1977, 6(2): 323–350.
Aho A V, Corasick M J. Efficient string matching: An aid to bibliographic search [J]. Communications of the ACM, 1975, 18(6): 333–340.
Han J, Pei J, Yin Y, et al. Mining frequent patterns without candidate generation [J]. Data Mining and Knowledge Discovery, 2004, 8(1): 53–87.
Agrawal R, Imielinski T, Swami A. Mining association rules between sets of items in large databases [C]// Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data. New York, USA: ACM Press, 1993: 207–216.
Cite this article
Qiu, Wd., Jin, L., Yang, Xn. et al. Bit stream oriented enumeration tree pruning algorithm. J. Shanghai Jiaotong Univ. (Sci.) 16, 567–570 (2011). https://doi.org/10.1007/s12204-011-1190-8
DOI: https://doi.org/10.1007/s12204-011-1190-8