A Novel Data Mining Approach for Analysis and Pattern Recognition of Active Fingerprinting Components

  • Harshit GujralEmail author
  • Sangeeta Mittal
  • Abhinav Sharma


Active fingerprinting is an effective penetration testing technique to know about vulnerability of hosts against security threats and network as a whole. Sometimes firewalls may block fingerprinting packets, hence making the probes infeasible. Measured Round Trip Time (RTTm) is a benign number that can be obtained from communication based on legitimate non malicious packets. In this paper, RTTm has been used along with other timers namely Smoothened Round-trip Time (SRTT), Round-trip Time Variance (RTTVar), Retransmission Time Out (RTO) and Scantime for pattern recognition and association analysis with the aid of cross-correlations. Experimental relationship among these timers are derived to back-up existing theoretical knowledge. A novel method to estimate IP-ID Sequence classes and network-traffic intensity based on these timers has been proposed. Results show that the model can be used to accurately derive (about 100% accuracy) active fingerprinting components IP-ID sequences and link traffic estimation. Analytical results obtained by this study can help in designing high-performance realistic networks and dynamic congestion control techniques.


Active fingerprinting Round-trip time Data mining Correlations Cybersecurity 



  1. 1.
    Edge, C., Barker, W., Hunter, B., & Sullivan, G. (2010). Network scanning, intrusion detection, and intrusion prevention tools. In Enterprise mac security (pp. 485–504). Apress.
  2. 2.
    Aikat, J., Kaur, J., Smith, F. D., & Jeffay, K. (2003). Variability in TCP round-trip times. In Proceedings of the 3rd ACM SIGCOMM conference on internet measurement (pp. 279–284). ACM.
  3. 3.
    Im, S. Y., Shin, S. H., Ryu, K. Y., & Roh, B. H. (2016). Performance evaluation of network scanning tools with operation of firewall. In Ubiquitous and future networks (ICUFN), 2016 eighth international conference on (pp. 876–881). IEEE.
  4. 4.
    Barnett, R. J., & Irwin, B. (2008). Towards a taxonomy of network scanning techniques. In Proceedings of the 2008 annual research conference of the South African institute of computer scientists and information technologists on IT research in developing countries: Riding the wave of technology (pp. 1–7). ACM.
  5. 5.
    Lyon, G. F. (2009). Nmap network scanning: The official Nmap project guide to network discovery and security scanning. Insecure.Google Scholar
  6. 6.
    Beverly, R., & Berger, A. (2015). Server siblings: Identifying shared IPv4/IPv6 infrastructure via active fingerprinting. In J. Mirkovic, & Y. Liu (Eds.), Passive and active measurement. PAM 2015. Lecture Notes in Computer Science (Vol. 8995). Cham: Springer. ​ Scholar
  7. 7.
    Xu, Q., et al. (2016). Device fingerprinting in wireless networks: Challenges and opportunities. IEEE Communications Surveys & Tutorials, 18(1), 94–104. Scholar
  8. 8.
    Jirsík, T., & Čeleda, P. (2014). Identifying operating system using flow-based traffic fingerprinting. In Meeting of the European network of universities and companies in information and communication engineering (Vol. 8846, pp. 70–73). Cham: Springer.​.
  9. 9.
    Ghiëtte, V., Blenn, N., & Doerr, C. (2016). Remote identification of port scan toolchains. In New technologies, mobility and security (NTMS), 2016 8th IFIP international conference on (pp. 1–5). IEEE.
  10. 10.
    Qin, F., Shi, P., Du, J., Cheng, R., & Zhou, Y. (2017). Research on network scanning strategy based on information granularity. In Journal of physics: Conference series (Vol. 910, No. 1, pp. 012001). IOP Publishing.Google Scholar
  11. 11.
    Shamsi, Z., et al. (2016). Hershel: Single-packet OS fingerprinting. IEEE/ACM Transactions on Networking (TON), 24(4), 2196–2209.Google Scholar
  12. 12.
    Jacobson, V. (1988). Congestion avoidance and control. ACM SIGCOMM Computer Communication Review, 18(4), 314–329.Google Scholar
  13. 13.
    Jain, R. (1989). A delay-based approach for congestion avoidance in interconnected heterogeneous computer networks. ACM SIGCOMM Computer Communication Review, 19(5), 56–71.Google Scholar
  14. 14.
    Brakmo, L. S., O’Malley, S. W., & Peterson, L. L. (1994). TCP Vegas: New techniques for congestion detection and avoidance (Vol. 24, No. 4, pp. 24–35). ACM.Google Scholar
  15. 15.
    Wang, Z., & Crowcroft, J. (1991). A new congestion control scheme: Slow start and search (Tri-S). ACM SIGCOMM Computer Communication Review, 21(1), 32–43.Google Scholar
  16. 16.
    Biaz, S., & Vaidya, N. H. (2003). Is the round-trip time correlated with the number of packets in flight? In Proceedings of the 3rd ACM SIGCOMM conference on internet measurement (vol. 278).
  17. 17.
    Padhye, J., Firoiu, V., Towsley, D., & Kurose, J. (1998). Modeling TCP throughput: A simple model and its empirical validation. ACM SIGCOMM Computer Communication Review, 28(4), 303–314.Google Scholar
  18. 18.
    Hengartner, U., Bolliger, J., & Gross, T. (2000). TCP Vegas revisited. In IEEE proceedings of the nineteenth annual joint conference of the IEEE computer and communications societies (INFOCOM 2000) (Vol. 3, pp. 1546–1555). IEEE.Google Scholar
  19. 19.
    Andren, J., Hilding, M., & Veitch, D. (1998). Understanding end-to-end internet traffic dynamics. In IEEE Global telecommunications conference, 1998 (GLOBECOM 1998). The Bridge to Global Integration (Vol. 2, pp. 1118–1122). IEEE.Google Scholar
  20. 20.
    Martin, J., Nilsson, A., & Rhee, I. (2003). Delay-based congestion avoidance for TCP. IEEE/ACM Transactions on Networking, 11(3), 356–369.Google Scholar
  21. 21.
    Martin, J., Nilsson, A., & Rhee, I. (2000). The incremental deployability of RTT-based congestion avoidance for high speed TCP Internet connections. ACM SIGMETRICS Performance Evaluation Review, 28(1), 134–144.Google Scholar
  22. 22.
    Morris, R. J. (1979). Fixing timeout intervals for lost packet detection in computer communication networks. In AFIPS conference proceedings.Google Scholar
  23. 23.
    Velten, D, Hinden, R., & Sax, J. (1984). Reliable data protocol; RFC908. In ARPANET Working Group requests for comments, no. 908. Menlo Park, CA: SRI International.Google Scholar
  24. 24.
    Sanghi, D., Subramaniam, M. C., Shankar, A. U., Gudmundsson, O., & Jalote, P. (1990). A TCP instrumentation and its use in evaluating roundtrip-time estimators (No. UMIACS-TR-90-38). Maryland Univ College Park Inst for Advanced Computer Studies.Google Scholar
  25. 25.
    Postel, J. (1981). Transmission control protocol, RFC 793. Information Sciences Institute, University of Southern California.Google Scholar
  26. 26.
    Karn, P., & Partridge, C. (1987). Improving round-trip time estimates in reliable transport protocols. ACM SIGCOMM Computer Communication Review, 17(5), 2–7. Scholar
  27. 27.
    Mills, D. (1983). Internet delay experiments; RFC889. ARPANET Working Group Requests for Comments (889).Google Scholar
  28. 28.
    Allman, M., & Paxson, V. (1999). On estimating end-to-end network path properties. ACM SIGCOMM Computer Communication Review, 29(4), 263–274.Google Scholar
  29. 29.
    Gujral, H. (2017). (Newtein). GitHub Repository—RTT analysis. Retrieved on December 20, 2017.
  30. 30.
    Lyon, G. (1997). Nmap (Version: 7.01) [Software]. Retrieved on December 20, 2017.
  31. 31.
    Paxson, V., & Allman, M. (2000). RFC 2988, Computing TCP’s retransmission Timer. Google Scholar
  32. 32.
    Paxson, V., Allman, M., Chu, J., & Sargent, M. (2011). RFC 6298, Computing TCP’s retransmission Timer. Google Scholar
  33. 33.
    Allman, M. (2000). A web server’s view of the transport layer. ACM SIGCOMM Computer Communication Review, 30(5), 10–20.Google Scholar
  34. 34.
    Jiang, H., & Dovrolis, C. (2002). Passive estimation of TCP round-trip times. ACM SIGCOMM Computer Communication Review, 32(3), 75–88.Google Scholar
  35. 35.
    Jaiswal, S., Iannaccone, G., Diot, C., Kurose, J., & Towsley, D. (2007). Measurement and classification of out-of-sequence packets in a tier-1 IP backbone. IEEE/ACM Transactions on Networking (ToN), 15(1), 54–66.Google Scholar
  36. 36.
    Prigent, G., Vichot, F., & Harrouet, F. (2010). IpMorph: Fingerprinting spoofing unification. Journal in Computer Virology, 6(4), 329–342.​.Google Scholar
  37. 37.
    Veal, B., Li, K., & Lowenthal, D. (2005). New methods for passive estimation of TCP round-trip times. In International workshop on passive and active network measurement (Vol. 3431, pp. 121–134). Berlin, Heidelberg: Springer. ​
  38. 38.
    Pearson, K. (1895). Note on regression and inheritance in the case of two parents. Proceedings of the Royal Society of London, 58, 240–242. Retrieved from Scholar
  39. 39.
    Farmer, S. F., Halliday, D. M., Conway, B. A., Stephens, J. A., & Rosenberg, J. R. (1997). A review of recent applications of cross-correlation methodologies to human motor unit recording. Journal of Neuroscience Methods, 74(2), 175–187.Google Scholar
  40. 40.
    Bacia, K., Kim, S. A., & Schwille, P. (2006). Fluorescence cross-correlation spectroscopy in living cells. Nature Methods, 3(2), 83.Google Scholar
  41. 41.
    Cliff, A. D., & Ord, K. (1970). Spatial autocorrelation: A review of existing and new measures with applications. Economic Geography, 46(sup1), 269–292.Google Scholar
  42. 42.
    Kohno, T., Broido, A., & Claffy, K. C. (2005). Remote physical device fingerprinting. IEEE Transactions on Dependable and Secure Computing, 2(2), 93–108. Scholar
  43. 43.
    Crotti, M., Dusi, M., Gringoli, F., & Salgarelli, L. (2007). Traffic classification through simple statistical fingerprinting. ACM SIGCOMM Computer Communication Review, 37(1), 5–16. ​ Scholar
  44. 44.
    Spangler, R. (2003). Analysis of remote active operating system fingerprinting tools. Madison: University of Wisconsin.Google Scholar
  45. 45.
    Geurts, P., Ernst, D., & Wehenkel, L. (2006). Extremely randomized trees. Machine Learning, 63(1), 3–42. Scholar
  46. 46.
    Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., et al. (2011). Scikit-learn: Machine learning in python. Journal of machine learning research, 12, 2825–2830.MathSciNetzbMATHGoogle Scholar
  47. 47.
    Olshen, R. A., & Stone, C. J. (1984). Classification and regression trees. Belmont, CA: The Wadsworth and Brook.zbMATHGoogle Scholar
  48. 48.
    Hastie, T., & Tibshirani, R., & Friedman, J. H. (2009). 10. Boosting and Additive Trees. In The elements of statistical learning (2nd ed., pp. 337–384). New York: Springer.Google Scholar
  49. 49.
    Breiman, L., & Cutler, A. (2007). Random forests-classification description (p. 2). Berkeley: Department of Statistics.Google Scholar

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. 1.Department of Computer Science Engineering and ITJaypee Institute of Information TechnologyNoidaIndia

Personalised recommendations