NetStore: An Efficient Storage Infrastructure for Network Forensics and Monitoring
Abstract
With the increasing sophistication of attacks, there is a need for network security monitoring systems that store and examine very large amounts of historical network flow data. An efficient storage infrastructure should provide both high insertion rates and fast data access. Traditional row-oriented Relational Database Management Systems (RDBMS) provide satisfactory query performance only for network flow data collected over a period of several hours. In many cases, such as the detection of sophisticated coordinated attacks, it is crucial to rapidly query days', weeks' or even months' worth of disk-resident historical data. For such monitoring and forensics queries, row-oriented databases become I/O bound due to long disk access times. Furthermore, their data insertion rate decreases as the number of indexes grows, and query processing time increases when unused attributes must be loaded along with the used ones. To overcome these problems, we propose a new column-oriented storage infrastructure for network flow records, called NetStore. NetStore is aware of network data semantics and access patterns, and benefits from the simple column-oriented layout without the need to meet general-purpose RDBMS requirements. The prototype implementation of NetStore can potentially achieve more than a tenfold query speedup and a ninetyfold reduction in storage size compared to traditional row-stores, and it performs better than existing open source column-stores for network flow data.
Keywords
Compression Method, Query Performance, Segment Size, Index Node, Insertion Rate