
E-QED: Electrical Bug Localization During Post-silicon Validation Enabled by Quick Error Detection and Formal Methods

  • Eshan Singh
  • Clark Barrett
  • Subhasish Mitra
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10427)

Abstract

During post-silicon validation, manufactured integrated circuits are extensively tested in actual system environments to detect design bugs. Bug localization involves identification of a bug trace (a sequence of inputs that activates and detects the bug) and a hardware design block where the bug is located. Existing bug localization practices during post-silicon validation are mostly manual and ad hoc, and, hence, extremely expensive and time consuming. This is particularly true for subtle electrical bugs caused by unexpected interactions between a design and its electrical state. We present E-QED, a new approach that automatically localizes electrical bugs during post-silicon validation. Our results on the OpenSPARC T2, an open-source 500-million-transistor multicore chip design, demonstrate the effectiveness and practicality of E-QED: starting with a failed post-silicon test, in a few hours (9 h on average) we can automatically narrow the location of the bug to (the fan-in logic cone of) a handful of candidate flip-flops (18 flip-flops on average for a design with ~1 million flip-flops) and also obtain the corresponding bug trace. The area impact of E-QED is ~2.5%. In contrast, determining this same information might take weeks (or even months) of mostly manual work using traditional approaches.


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Stanford University, Stanford, USA
