Probabilistic Verification for Reliable Network-on-Chip System Design

  • Benjamin Lewis
  • Arnd Hartmanns
  • Prabal Basu
  • Rajesh Jayashankara Shridevi
  • Koushik Chakraborty
  • Sanghamitra Roy
  • Zhen Zhang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11687)

Abstract

The design of modern network-on-chip (NoC) systems faces reliability challenges due to process and environmental variations. Peak power supply noise (PSN) in the power delivery network of a NoC device plays a critical role in determining reliable operations: PSN typically leads to voltage droop, which can cause timing errors in the NoC router pipelines. Existing simulation-based approaches cannot provide rigorous, worst-case reliability guarantees on the probabilistic behaviors of PSN. To address this problem, this paper takes a significant step in formally analyzing PSN in modern NoCs. Specifically, we present a probabilistic model checking approach for the rigorous characterization of PSN for a generic central router of a large mesh-NoC system, under the Round Robin scheduling mechanism with a uniform random network traffic load. Defining features for PSN are extracted at the behavioral level to facilitate property formulation. Several abstract models have been derived for the central router’s concrete model based on the observations of its arbiter’s conflict resolution behavior. Probabilistic modeling and verification are performed using the Modest Toolset. Results show significant scalability of our abstract models, and reveal key PSN characteristics that are indicative of NoC design and optimization.

Keywords

Probabilistic model checking · Network-on-chip · Reliability analysis · Power supply noise

Notes

Acknowledgments

Arnd Hartmanns was supported by NWO VENI grant 639.021.754. Benjamin Lewis, Prabal Basu, Rajesh Jayashankara Shridevi, Koushik Chakraborty, and Sanghamitra Roy were supported in part by National Science Foundation (NSF) grants CAREER-1253024, CNS-1421022, and CNS-1421068. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Utah State University, Logan, USA
  2. University of Twente, Enschede, The Netherlands
