Abstract
Online testing is critical to ensuring reliable operations of the next generation of supercomputers based on a kilo-core network-on-chip (NoC) interconnection fabric. We present a parallel software-based self-testing (SBST) solution that makes use of the bounded model checking (BMC) technique to generate test sequences and parallel packets. In this method, the parallel SBST with BMC derives the leading sequence for each router’s internal function and detects all functionally- testable faults related to the function. A Monte-Carlo simulation algorithm is then used to search for the approximately optimum configuration of the parallel packets, which guarantees the test quality and minimizes the test cost. Finally, a multi-threading technology is used to ensure that the Monte-Carlo simulation can reach the approximately optimum configuration in a large random space and reduce the generating time of the parallel test. Experimental results show that the proposed method achieves a high fault coverage with a reduced test overhead. Moreover, by performing online testing in the functional mode with SBST, it effectively avoids the over-testing problem caused by functionally untestable turns in kilo-core NoCs.
References
Zhang Y, Hong X P, Chen Z S, Peng Z B, Jiang J H. A deterministic-path routing algorithm for tolerating many faults on very-large-scale network-on-chip. ACM Trans. Design Automation of Electronic Systems, 2021, 26(1): Article No. 8. https://doi.org/10.1145/3414060.
Zhang Y, Chakrabarty K, Li H W, Jiang J H. Softwarebased online self-testing of network-on-chip using bounded model checking. In Proc. the 2017 IEEE International Test Conference, Oct. 31–Nov. 2, 2017. 10.1109/TEST.2017.8242037.
Dongarra J. Report on the Sunway TaihuLight system. Tech Report UT-EECS-16-742, Oak Ridge National Laboratory, 2016. https://netlib.org/utk/people/JackDongarra/PAPERS/sunway-report-2016-old.pdf, Mar. 2023.
Zhang L, Han Y H, Xu Q, Li X W, Li H W. On topology reconfiguration for defect-tolerant NoC-based homogeneous manycore systems. IEEE Trans. Very Large Scale Integration (VLSI) Systems, 2009, 17(9): 1173–1186. https://doi.org/10.1109/TVLSI.2008.2002108.
Ouyang Y M, Da J, Wang X M, Han Q Q, Liang H G, Du G M. A TSV fault-tolerant scheme based on failure classification in 3D-NoC. Journal of Circuits, Systems and Computers, 2017, 26(4): 1750059. https://doi.org/10.1142/S0218126617500591.
Chen Z S, Zhang Y, Peng Z B, Jiang J H. A deterministic-path routing algorithm for tolerating many faults on wafer-level NoC. In Proc. the 2019 Design, Automation and Test in Europe Conference & Exhibition, Mar. 2019, pp.1337–1342. https://doi.org/10.23919/DATE.2019.8714948.
Lee D, Das S, Doppa J R, Pande P P, Chakrabarty K. Performance and thermal tradeoffs for energy-efficient monolithic 3D network-on-chip. ACM Trans. Design Automation of Electronic Systems, 2018, 23(5): Article No. 60. https://doi.org/10.1145/3223046.
Liu W C, Yang L, Jiang W W, Feng L, Guan N, Zhang W, Dutt N. Thermal-aware task mapping on dynamically reconfigurable network-on-chip based multiprocessor system-on-chip. IEEE Trans. Computers, 2018, 67(12): 1818–1834. https://doi.org/10.1109/TC.2018.2844365.
Xiang D, Chakrabarty K, Fujiwara H. Multicast-based testing and thermal-aware test scheduling for 3D ICs with a stacked network-on-chip. IEEE Trans. Computers, 2016, 65(9): 2767–2779. https://doi.org/10.1109/TC.2015.2493548.
Wang L, Wang X H, Leung H F, Mak T. A non-minimal routing algorithm for aging mitigation in 2D-mesh NoCs. IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, 2019, 38(7): 1373–1377. https://doi.org/10.1109/TCAD.2018.2855149.
Xiang D, Shen K L, Bhattacharya B B, Wen X Q, Lin X J. Thermal-aware small-delay defect testing in integrated circuits for mitigating overkill. IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, 2016, 35(3): 499–512. https://doi.org/10.1109/TCAD.2015.2474365.
Biere A, Cimatti A, Clarke E M, Strichman O, Zhu Y S. Bounded model checking. Advances in Computers, 2003, 58: 117–148. https://doi.org/10.1016/S0065-2458(03)58003-2.
Clarke E, Biere A, Raimi R, Zhu Y S. Bounded model checking using satisfiability solving. Formal Methods in System Design, 2001, 19(1): 7–34. https://doi.org/10.1023/A:1011276507260.
Cota E, Liu C. Constraint-driven test scheduling for NoCbased systems. IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, 2006, 25(11): 2465–2478. https://doi.org/10.1109/TCAD.2006.881331.
Yuan F, Huang L, Xu Q. Re-examining the use of network-on-chip as test access mechanism. In Proc. the 2008 Conference on Design, Automation and Test in Europe, Mar. 2018, pp.808–811. https://doi.org/10.1145/1403375.1403571.
Richter M, Chakrabarty K. Optimization of test pincount, test scheduling, and test access for NoC-based multicore SoCs. IEEE Trans. Computers, 2014, 63(3): 691–702. https://doi.org/10.1109/TC.2013.82.
Strano A, Gómez C, Ludovici D, Favalli M, Gómez M E, Bertozzi D. Exploiting network-on-chip structural redundancy for a cooperative and scalable built-in self-test architecture. In Proc. the 2011 Design, Automation and Test in Europe, Mar. 2011. https://doi.org/10.1109/DATE.2011.5763109.
Wang J S, Ebrahimi M, Huang L T, Xie X, Li Q, Li G J, Jantsch A. Efficient design-for-test approach for networks-on-chip. IEEE Trans. Computers, 2019, 68(2): 198–213. https://doi.org/10.1109/TC.2018.2865948.
Herve M B, Cota E, Kastensmidt F L, Lubaszewski M. NoC interconnection functional testing: Using boundaryscan to reduce the overall testing time. In Proc. the 10th Latin American Test Workshop, Mar. 2009. https://doi.org/10.1109/LATW.2009.4813801.
Kakoee M R, Bertacco V, Benini L. At-speed distributed functional testing to detect logic and delay faults in NoCs. IEEE Trans. Computers, 2014, 63(3): 703–717. https://doi.org/10.1109/TC.2013.202.
Kranitis N, Merentitis A, Theodorou G, Paschalis A, Gizopoulos D. Hybrid-SBST methodology for efficient testing of processor cores. IEEE Design & Test of Computers, 2008, 25(1): 64–75. https://doi.org/10.1109/MDT.2008.15.
Collet J H, Zajac P, Psarakis M, Gizopoulos D. Chip selforganization and fault tolerance in massively defective multicore arrays. IEEE Trans. Dependable and Secure Computing, 2011, 8(2): 207–217. https://doi.org/10.1109/TDSC.2009.53.
Dalirsani A, Imhof M E, Wunderlich H J. Structural software-based self-test of network-on-chip. In Proc. the 32nd VLSI Test Symposium, Apr. 2014. https://doi.org/10.1109/VTS.2014.6818754.
Cheng K T, Krishnakumar A S. Automatic functional test generation using the extended finite state machine model. In Proc. the 30th International Design Automation Conference, June 1993, pp.86–91. https://doi.org/10.1145/157485.164585.
Psarakis M, Gizopoulos D, Sanchez E, Reorda M S. Microprocessor software-based self-testing. IEEE Design & Test of Computers, 2010, 27(3): 4–19. https://doi.org/10.1109/MDT.2010.5.
Ganai M K, Gupta A. Accelerating high-level bounded model checking. In Proc. the 2006 International Conference on Computer Aided Design, Nov. 2006, pp.794–801. https://doi.org/10.1109/ICCAD.2006.320122.
Tupuri R S, Abraham J A. A novel functional test generation method for processors using commercial ATPG. In Proc. the 1997 International Test Conference, Nov. 1997, pp.743–752. https://doi.org/10.1109/TEST.1997.639687.
Zhang Y, Li H W, Li X W. Automatic test program generation using executing-trace-based constraint extraction for embedded processors. IEEE Trans. Very Large Scale Integration (VLSI) Systems, 2013, 27(7): 1220–1233. https://doi.org/10.1109/TVLSI.2012.2208130.
Negro V C, Goldstein L H. The generation of correlated multivariate samples for Monte Carlo simulation. IEEE Spectrum, 1968, 5(2): 5. https://doi.org/10.1109/MSPEC.1968.5214753.
Author information
Authors and Affiliations
Corresponding author
Supplementary Information
ESM 1
(PDF 145 kb)
Rights and permissions
About this article
Cite this article
Zhang, Y., Ji, PF., Zhu, PW. et al. Parallel Software-Based Self-Testing with Bounded Model Checking for Kilo-Core Networks-on-Chip. J. Comput. Sci. Technol. 38, 405–421 (2023). https://doi.org/10.1007/s11390-022-2553-3
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11390-022-2553-3