Abstract
With the increasing use of autonomous and mission-critical systems, in-field concurrent testing (i.e., testing while the device is in operation) is becoming progressively more important. However, concurrent testing of analog and mixed-signal systems often has to rely on anomaly indicators, because analog circuits lack clear 1/0 error conditions. In this paper we show how simultaneously monitoring different but related anomalies across a mixed-signal system can improve the diagnosis of error conditions and rule out false errors. To achieve this kind of federated learning between modules across the system, we need an architecture through which all modules can communicate their anomaly information with minimum overhead. To that end, we first develop a data structure, the Weighted Alarm Collector (WAC), that efficiently encapsulates all the anomaly information in a system. Second, we show how WACs can be merged through hierarchies in a scalable manner. Based on this, we also develop an architecture that facilitates learning between modules across hierarchies. The main benefits of this architecture are that a) it enables distributed error learning within a system and b) it is scalable and can be built bottom-up, which is compatible with current IP-based modular design methodology. Finally, we show applications of this architecture to some common industry use cases.
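The abstract does not define the internals of a WAC, so the following is only an illustrative sketch of the general idea of collecting weighted alarms per module and merging them up a hierarchy. It assumes a counting-Bloom-filter-style array of accumulated weights; the class name `WeightedAlarmCollector`, its methods, the alarm identifiers, and the weights are all hypothetical and not taken from the paper.

```python
import hashlib
from typing import List, Optional


class WeightedAlarmCollector:
    """Hypothetical WAC: a fixed-size array of accumulated alarm weights.

    Assumption: alarms are hashed into a few cells, in the style of a
    counting Bloom filter. The paper's actual encoding may differ.
    """

    def __init__(self, size: int = 32, num_hashes: int = 2,
                 cells: Optional[List[int]] = None):
        self.size = size
        self.num_hashes = num_hashes
        self.cells = list(cells) if cells is not None else [0] * size

    def _indices(self, alarm_id: str) -> List[int]:
        # Derive distinct cell indices so one alarm never hits a cell twice.
        found = set()
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{alarm_id}".encode()).digest()
            found.add(int.from_bytes(digest[:4], "big") % self.size)
        return sorted(found)

    def raise_alarm(self, alarm_id: str, weight: int = 1) -> None:
        # Add the anomaly's weight to every cell the alarm maps to.
        for idx in self._indices(alarm_id):
            self.cells[idx] += weight

    def estimate(self, alarm_id: str) -> int:
        # Upper bound on the accumulated weight; collisions can only inflate it.
        return min(self.cells[idx] for idx in self._indices(alarm_id))

    def merge(self, other: "WeightedAlarmCollector") -> "WeightedAlarmCollector":
        # Parent-level collector: cell-wise sum of two child collectors.
        if (self.size, self.num_hashes) != (other.size, other.num_hashes):
            raise ValueError("collectors must share size and hash count")
        summed = [a + b for a, b in zip(self.cells, other.cells)]
        return WeightedAlarmCollector(self.size, self.num_hashes, summed)


# Two hypothetical modules report a related anomaly; the parent merges them.
adc = WeightedAlarmCollector()
pll = WeightedAlarmCollector()
adc.raise_alarm("vref_drift", weight=3)
pll.raise_alarm("vref_drift", weight=2)
pll.raise_alarm("lock_jitter", weight=1)
top = adc.merge(pll)
```

Because merging is a cell-wise sum in this sketch, it is associative and commutative, so collectors could be combined in any order while moving up a hierarchy without growing in size.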
Additional information
Responsible Editor: M. Sachdev
About this article
Cite this article
Kundu, R., Su, F. & Goteti, P. A Distributed Error and Anomaly Communication Architecture for Analog and Mixed-Signal Systems. J Electron Test 35, 317–334 (2019). https://doi.org/10.1007/s10836-019-05795-y
DOI: https://doi.org/10.1007/s10836-019-05795-y