FT-EALU: fault-tolerant arithmetic and logic unit for critical embedded and real-time systems

The Journal of Supercomputing

Abstract

This paper presents FT-EALU, a fault-tolerant arithmetic and logic unit (ALU) based on time redundancy and reward/punishment-based learning, intended for real-time embedded systems with limited hardware and power budgets. In this method, each operation is diversified into three versions so that permanent faults can be corrected along with transient ones. The versions are differentiated by lightweight modifications that remove the effect of permanent faults. Lightweight modifications such as shift and swap avoid high timing overhead in computation while making the versions different enough for fault detection. The versions are executed serially in time, and a weighted voting module generates the final output from their results and their learned weights. In this module, a reward/punishment strategy assigns each version of execution a weight that indicates its contribution to the final output. The weight is derived from the version's correction capability across several fault injection scenarios, so it reflects the reliability of each intermediate result as well as its effect on the final result. Weights may be positive or negative and are derived at the bit level, according to how well each version mitigates permanent faults in those scenarios. The final result is then generated bit by bit from each version's computed result and its weight. Because it relies on time redundancy and a static learning approach to correct permanent faults, the proposed method is lower in cost and more efficient than related work, which is mainly based on information and hardware redundancy. Experiments show that FT-EALU corrects about \(84.93\%\) and \(69.71\%\) of permanent faults injected on single and double bits of input data, respectively.
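
The scheme summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 8-bit width, the stuck-at fault on one bit of the adder's first input port, the particular shift and swap versions, the unit reward and penalty values, and all function names (`faulty_add`, `run_versions`, `weighted_vote`, `learn_weights`) are assumptions chosen for illustration. The sketch runs three diversified versions of an addition serially on the same faulty adder, learns one weight per version and per bit from fault injection scenarios, and builds the output bit by bit from the weighted votes.

```python
WIDTH = 8                      # assumed data width
MASK = (1 << WIDTH) - 1


def faulty_add(x, y, stuck_bit=None, stuck_val=0):
    """Adder whose first input port has one bit permanently stuck.

    Hypothetical fault model: bit `stuck_bit` of x is forced to `stuck_val`.
    """
    if stuck_bit is not None:
        if stuck_val:
            x |= 1 << stuck_bit
        else:
            x &= ~(1 << stuck_bit)
    return x + y


def run_versions(a, b, stuck_bit=None, stuck_val=0):
    """Execute three diversified versions serially on the same adder."""
    # Version 0: plain execution.
    plain = faulty_add(a, b, stuck_bit, stuck_val) & MASK
    # Version 1: operands shifted left by one, result shifted back.
    shifted = (faulty_add(a << 1, b << 1, stuck_bit, stuck_val) >> 1) & MASK
    # Version 2: operands swapped, so the faulty port sees b instead of a.
    swapped = faulty_add(b, a, stuck_bit, stuck_val) & MASK
    return [plain, shifted, swapped]


def weighted_vote(results, weights):
    """Build the output bit by bit; a version votes +w for 1 and -w for 0."""
    out = 0
    for bit in range(WIDTH):
        score = sum(w[bit] if (r >> bit) & 1 else -w[bit]
                    for r, w in zip(results, weights))
        if score > 0:
            out |= 1 << bit
    return out


def learn_weights(scenarios, reward=1, penalty=1):
    """Static reward/punishment learning over fault injection scenarios.

    Each scenario is (a, b, stuck_bit, stuck_val). A version is rewarded for
    every output bit that matches the fault-free result and punished otherwise.
    """
    weights = [[0] * WIDTH for _ in range(3)]
    for a, b, stuck_bit, stuck_val in scenarios:
        golden = (a + b) & MASK
        versions = run_versions(a, b, stuck_bit, stuck_val)
        for v, result in enumerate(versions):
            for bit in range(WIDTH):
                if ((result ^ golden) >> bit) & 1:
                    weights[v][bit] -= penalty
                else:
                    weights[v][bit] += reward
    return weights


if __name__ == "__main__":
    a, b = 0b00001000, 0b00000001
    results = run_versions(a, b, stuck_bit=3, stuck_val=0)
    uniform = [[1] * WIDTH for _ in range(3)]   # untrained: plain majority
    print(results, weighted_vote(results, uniform))
```

With bit 3 of the first port stuck at 0 and operands 8 and 1, the plain version returns 1 while the shifted and swapped versions both return the correct sum 9, so the vote recovers 9. In this sketch a negative learned weight inverts a version's vote at that bit position, which is one possible reading of the positive and negative weights mentioned above.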

Data Availability

Please contact the corresponding author for data requests.

Author information

Corresponding author

Correspondence to Athena Abdi.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Abdi, A., Shahoveisi, S. FT-EALU: fault-tolerant arithmetic and logic unit for critical embedded and real-time systems. J Supercomput 79, 626–649 (2023). https://doi.org/10.1007/s11227-022-04698-8

  • DOI: https://doi.org/10.1007/s11227-022-04698-8
