Zen and the Art of Network Troubleshooting: A Hands on Experimental Study

  • François Espinet
  • Diana Joumblatt
  • Dario Rossi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9053)


Growing network complexity necessitates tools and methodologies to automate network troubleshooting. In this paper, we follow a crowd-sourcing trend, and argue for the need to deploy measurement probes at end-user devices and gateways, which can be under the control of the users or the ISP.

Depending on the amount of information available to the probes (e.g., ISP topology), we formalize the network troubleshooting task as either a clustering or a classification problem. We solve it with an algorithm that (i) achieves perfect classification under the assumption of a strategic selection of probes (e.g., assisted by an ISP) and (ii) operates blindly with respect to the network performance metrics, of which we consider delay and bandwidth in this paper.

While previous work on network troubleshooting favors theoretical over practical approaches, our workflow balances both aspects: (i) we conduct a set of controlled experiments with a rigorous and reproducible methodology, (ii) we run them on an emulator that we thoroughly calibrate, and (iii) we contrast experimental results affected by real-world noise with the results expected from a probabilistic model.
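
When troubleshooting is cast as a clustering problem, one standard way to score a grouping of probes against ground truth is the Rand index, which appears among this paper's keywords. The sketch below computes it from scratch. The function name `rand_index` and the probe labels are invented for illustration (1 for a probe assumed to have a path crossing the faulty link, 0 otherwise); this is not the authors' implementation.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two partitions agree.

    A pair counts as an agreement when both partitions place the two
    items in the same group, or both place them in different groups.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0  # zero or one item: the partitions agree trivially
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical ground truth: probes 0-2 are affected by the faulty link.
truth = [1, 1, 1, 0, 0, 0]
# Hypothetical clustering output that misplaces probe 2.
predicted = [1, 1, 0, 0, 0, 0]

score = rand_index(truth, predicted)
print(f"Rand index: {score:.3f}")
```

The index depends only on how items are grouped, not on the label values, so a clustering that swaps the two cluster names still scores 1.0 against the ground truth.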


Keywords: Rand index, selection policy, delay measurement, faulty link, strategic selection
These keywords were added by machine and not by the authors.



Copyright information

© IFIP International Federation for Information Processing 2015

Authors and Affiliations

  • François Espinet (1)
  • Diana Joumblatt (2), corresponding author
  • Dario Rossi (1, 2)

  1. Ecole Polytechnique, Paris, France
  2. Telecom ParisTech, Paris, France
