Telecommunication Systems, Volume 52, Issue 2, pp 847–860

An analysis of interdomain availability and causes of failures based on active measurements

  • Eugene S. Myakotnykh
  • Otto J. Wittner
  • Bjarne E. Helvik
  • Atef Abdelkefi
  • Jon Kåre Hellan
  • Olav Kvittem
  • Trond Skjesol
  • Arne Øslebø

DOI: 10.1007/s11235-011-9586-1

Cite this article as:
Myakotnykh, E.S., Wittner, O.J., Helvik, B.E. et al. Telecommun Syst (2013) 52: 847. doi:10.1007/s11235-011-9586-1

Abstract

To better understand how the global Internet can achieve an availability on the order of “five nines”, i.e. be available 0.99999 of the time, active measurements were performed between Norway and China through the Global Research Network. End-to-end downtime statistics were collected during two 3-month periods: mid-November 2009 to mid-February 2010 and July 2010 to September 2010. Probe packets were sent every 10 ms between the two measurement systems, supplemented by traceroute measurements every two minutes. The collected data (TTL, timestamps, sequence numbers and traceroute output) enabled identification and characterization of IP-level paths between the end-points. Causes of observed network failures were identified, and insight was gained into the processes preceding and following communication downtimes. We distinguish inter- and intradomain failures and, when possible, identify the exact link or Autonomous System where an event occurred. The study shows that end-to-end path availability is mainly affected by interdomain failures and long BGP convergence times, as well as by series of events not straightforwardly explained by the anticipated (re)routing behavior.
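
To make the "five nines" target concrete, the following minimal Python sketch computes the downtime budget implied by a given availability and estimates availability from probe arrival times. It is an illustration only, not the authors' analysis method: the function names, the gap threshold, and the rule "a gap longer than the threshold counts as downtime" are assumptions introduced here. Only the 10 ms probe interval and the 0.99999 target come from the abstract.

```python
from typing import List, Tuple

PROBE_INTERVAL_S = 0.010  # probes sent every 10 ms (from the abstract)


def downtime_budget_s(availability: float, period_s: float) -> float:
    """Maximum downtime (seconds) allowed over period_s at the given availability."""
    return (1.0 - availability) * period_s


def find_gaps(recv_times: List[float], threshold_s: float) -> List[Tuple[float, float]]:
    """Return (start, end) pairs where consecutive probe arrivals are
    more than threshold_s apart. The threshold is a hypothetical parameter."""
    gaps = []
    for prev, cur in zip(recv_times, recv_times[1:]):
        if cur - prev > threshold_s:
            gaps.append((prev, cur))
    return gaps


def estimate_availability(recv_times: List[float], threshold_s: float) -> float:
    """Rough availability estimate: 1 - (total gap time / observed time span).
    Assumes recv_times is sorted and holds at least two arrivals."""
    total = recv_times[-1] - recv_times[0]
    down = sum(end - start for start, end in find_gaps(recv_times, threshold_s))
    return 1.0 - down / total


if __name__ == "__main__":
    # Budget for a 90-day period, used here as an approximation of "3 months".
    print(downtime_budget_s(0.99999, 90 * 24 * 3600))
```

Under these assumptions, a 90-day measurement period leaves a downtime budget of a little over a minute at five nines, so a single slow routing convergence event could consume a large share of it.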

Keywords

  • Dependability
  • Failure analysis
  • Failure detection
  • Network measurements
  • Quality of Services
  • Routing

Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Eugene S. Myakotnykh (1)
  • Otto J. Wittner (2)
  • Bjarne E. Helvik (3)
  • Atef Abdelkefi (3)
  • Jon Kåre Hellan (2)
  • Olav Kvittem (4)
  • Trond Skjesol (2)
  • Arne Øslebø (2)
  1. Network and Systems Development, Sibcom, Oslo, Norway
  2. IoU department, UNINETT, Trondheim, Norway
  3. Centre for Quantifiable Quality of Service in Communication Systems, Centre of Excellence, NTNU, Trondheim, Norway
  4. CTO IoU department, UNINETT, Trondheim, Norway