Is Our Ground-Truth for Traffic Classification Reliable?
The validation of the different proposals in the traffic classification literature is a controversial issue. Usually, these works base their results on a ground-truth built from private datasets and labeled by techniques of unknown reliability. This makes validation and comparison with other solutions extremely difficult. This paper aims to be a first step towards addressing the validation and trustworthiness problem of network traffic classifiers. We compare six well-known DPI-based techniques that are frequently used in the literature for ground-truth generation. To evaluate these tools, we have carefully built a labeled dataset of more than 500 000 flows, which contains traffic from popular applications. Our results present PACE, a commercial tool, as the most reliable solution for ground-truth generation. However, among the available open-source tools, NDPI and, especially, Libprotoident also achieve very high precision, while other, more frequently used tools (e.g., L7-filter) are not reliable enough and should not be used for ground-truth generation in their current form.
Keywords: Virtual Machine, Traffic Classification, False Positive Ratio, Deep Packet Inspection, Private Dataset