Abstract
This paper presents FT-EALU, a fault-tolerant arithmetic and logic unit (ALU) based on time redundancy and reward/punishment-based learning, intended for real-time embedded systems with tight hardware and power budgets. Each operation is diversified into three versions so that permanent faults can be corrected along with transient ones. The versions differ through lightweight modifications, such as shift and swap, which keep the timing overhead low while making the versions different enough that a permanent fault does not corrupt them all in the same way. The versions are executed serially in time, and a weighted voting module produces the final output from their results. The weight of each version is obtained through a reward/punishment strategy and reflects that version's ability to correct faults across several fault injection scenarios; it therefore expresses both the reliability of the version's result and its influence on the final result. Weights may be positive or negative and are derived at the bit level, and the final result is generated bit by bit from each version's computed result and its weight.
Because it relies on time redundancy and a static learning approach to correct permanent faults, the proposed method is lower in cost and more efficient than related work, which is mainly based on information and hardware redundancy. Experiments show that FT-EALU corrects about \(84.93\%\) and \(69.71\%\) of permanent faults injected on single and double bits of the input data, respectively.
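The scheme summarized above (three diversified versions, serial execution, bit-level weighted voting) can be illustrated with a small Python sketch. This is not the FT-EALU implementation: the 8-bit width, the choice of XOR as the operation, the rotate and half-swap transforms, the stuck-at fault on one input line, and the reward/penalty values are all assumptions made for illustration, and every function name is hypothetical.

```python
WIDTH = 8                      # assumed word size (illustrative)
MASK = (1 << WIDTH) - 1

def identity(x):
    return x

def rotl(x, k=1):
    """Rotate left by k bits within WIDTH."""
    return ((x << k) | (x >> (WIDTH - k))) & MASK

def rotr(x, k=1):
    """Rotate right by k bits within WIDTH (undoes rotl)."""
    return ((x >> k) | (x << (WIDTH - k))) & MASK

def swap_halves(x):
    """Exchange the upper and lower halves of the word (self-inverse)."""
    h = WIDTH // 2
    return ((x << h) | (x >> h)) & MASK

# Assumed diversification: (encode operands, decode result) per version.
VERSIONS = [(identity, identity), (rotl, rotr), (swap_halves, swap_halves)]

def faulty_xor(a, b, fault=None):
    """XOR unit whose first input may have one line stuck at 0 or 1.

    fault is None or (bit, stuck_value); this fault model is an assumption.
    """
    if fault is not None:
        bit, stuck = fault
        a = a | (1 << bit) if stuck else a & ~(1 << bit)
    return (a ^ b) & MASK

def run_versions(a, b, fault=None):
    """Run the three versions one after another on the same (faulty) unit."""
    return [dec(faulty_xor(enc(a), enc(b), fault)) for enc, dec in VERSIONS]

def learn_weights(samples, reward=1, penalty=1):
    """Static reward/punishment pass over single stuck-at scenarios.

    A version's bit is rewarded when it matches the fault-free result
    and punished when it does not.
    """
    weights = [[0] * WIDTH for _ in VERSIONS]
    for a, b in samples:
        golden = (a ^ b) & MASK
        for bit in range(WIDTH):
            for stuck in (0, 1):
                results = run_versions(a, b, (bit, stuck))
                for v, res in enumerate(results):
                    diff = res ^ golden
                    for i in range(WIDTH):
                        if (diff >> i) & 1:
                            weights[v][i] -= penalty
                        else:
                            weights[v][i] += reward
    return weights

def weighted_vote(results, weights):
    """Build the output bit by bit from signed, weighted version votes."""
    out = 0
    for i in range(WIDTH):
        score = sum(w[i] if (r >> i) & 1 else -w[i]
                    for r, w in zip(results, weights))
        if score > 0:
            out |= 1 << i
    return out

# Usage: learn weights offline, then vote on a run with input bit 2 stuck at 1.
weights = learn_weights([(0x3C, 0xA5), (0xF0, 0x0F), (0x12, 0x34)])
results = run_versions(0x5A, 0xC3, fault=(2, 1))
voted = weighted_vote(results, weights)
```

In this toy model the same stuck line corrupts a different result bit in each version, so at every bit position at most one version is wrong and the vote recovers the fault-free value. The learned weights also come out equal, which makes the vote behave like a plain majority; per-bit weights only start to matter under richer fault models than the one assumed here.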
Data Availability
Please contact the corresponding author for data requests.
Ethics declarations
Conflict of interest
The authors declare that they have no competing interests.
Cite this article
Abdi, A., Shahoveisi, S. FT-EALU: fault-tolerant arithmetic and logic unit for critical embedded and real-time systems. J Supercomput 79, 626–649 (2023). https://doi.org/10.1007/s11227-022-04698-8
DOI: https://doi.org/10.1007/s11227-022-04698-8