1 Introduction

Conventional integrated circuit technology, which is based on complementary metal-oxide semiconductor (CMOS) transistors, represents binary information as current and voltage levels. With the continuous scaling down of transistor feature sizes at the nanoscale, problematic issues associated with the performance of devices using this technology have become apparent, such as high resistances, charge quantization, inadequate switch levels, leakage currents, and heat dissipation [1,2,3].

Heat dissipation is one of the most critical issues in the construction of electronic devices. Conventional computation technologies mainly rely on irreversible operations. For instance, the AND gate transforms two input bits into a single output bit; thus, it loses one bit of information. In 1961, Rolf Landauer asserted that irreversible computation loses information in the form of heat, dissipating at least kBT ln 2 per erased bit, with kB being the Boltzmann constant and T being the system temperature [4]. The energy dissipation caused by these information losses was considered negligible for many years, although it is always much larger than the Landauer bound [5]. Miniaturization of computational devices and improvements in materials and fabrication processes have brought the energy dissipation of modern circuits and systems close to the Landauer bound. Thus, methods of computation that perform logic operations without losing information are required to continue reducing the energy consumption beyond the Landauer limit. Reversible computing is an emerging computational paradigm for overcoming the heat dissipation problem. In reversible operations, the computation circuits utilize reversible logic gates in which the numbers of input and output pins are the same.
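To give a sense of scale for the Landauer bound, the following minimal sketch evaluates kBT ln 2 numerically. The function name and the two example temperatures are illustrative assumptions and are not taken from the cited works.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def landauer_bound(temperature_k: float) -> float:
    """Return k_B * T * ln(2), the minimum heat in joules per erased bit."""
    if temperature_k <= 0:
        raise ValueError("temperature must be positive")
    return K_B * temperature_k * math.log(2)


if __name__ == "__main__":
    # Example temperatures chosen only for illustration (assumption).
    for t in (1.0, 300.0):
        print(f"T = {t:6.1f} K -> {landauer_bound(t):.3e} J per erased bit")
```

At room temperature the bound is on the order of 10^-21 J per bit, which is why it only becomes relevant once circuit-level dissipation has been reduced by many orders of magnitude.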

In 1973, Bennett proved that circuit energy dissipation is theoretically eliminated if computational operations are performed unaccompanied by information loss [6]. Therefore, to overcome the limitations of energy dissipation, computation operations must be carried out reversibly [7]. In other words, the technologies that conduct reversible operations could eventually allow for ultraefficient computing. However, DeBenedictis [8] stressed that reversible computing is an effective low-power technique only if reversibility is sustained to the physical level. A nanoscale scheme that can overcome the limitations of current CMOS-based very-large-scale integration (VLSI) technology is quantum-dot cellular automata (QCA) [9]. QCA is a suitable transistor-less paradigm nanotechnology for performing digital logic operations that are both logically and physically reversible, allowing for the realization of ultralow energy dissipation computing.

QCA promises very low power dissipation, an extreme density structure, and a very high-speed operation frequency for digital circuits at the nanoscale [9]. In 1993, Lent et al. [10] proposed a physical implementation of digital computation using quantum-dot cells. QCA employs field-coupled nanotechnology (FCN), in which information is stored as the polarization of the electron configuration within quantum-dot cells and is propagated to neighbouring cells through electrostatic interactions [9]. The specified circuit layout and electrostatic interactions among adjacent cells permit logic function implementation, and numerous studies are investigating QCA as a future computing technology [11,12,13]. The basic units of QCA circuits are QCA cells. Each cell consists of four quantum dots placed at the corners of a square. There are two free electrons within each cell; these electrons can tunnel between quantum dots to represent two binary configurations. The two electrons tend to occupy diagonally opposite dots because of their electrostatic repulsion. As illustrated in Fig. 1, a QCA cell can be in one of two states, with cell polarization P = −1 or P = +1, which represent binary 0 and 1, respectively.

Fig. 1 QCA cell polarization

The layout of the cells allows the QCA technology to implement specific logic gates. The fundamental QCA logic gate is the majority gate, which can produce AND and OR gates. Figure 2 illustrates the QCA wire, inverter, and majority gate in the system.

Fig. 2 a QCA wire, b QCA inverter, and c QCA majority gate
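The way a majority gate yields AND and OR can be sketched at the Boolean level: fixing one input to 0 gives AND, and fixing it to 1 gives OR. The snippet below is only a logical model with hypothetical function names; it says nothing about cell layout or clocking.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    """Three-input majority vote: returns 1 when at least two inputs are 1."""
    return (a & b) | (a & c) | (b & c)


def and_gate(a: int, b: int) -> int:
    """AND realized as a majority gate with one input fixed to 0."""
    return majority(a, b, 0)


def or_gate(a: int, b: int) -> int:
    """OR realized as a majority gate with one input fixed to 1."""
    return majority(a, b, 1)


if __name__ == "__main__":
    for a, b in product((0, 1), repeat=2):
        print(a, b, "AND:", and_gate(a, b), "OR:", or_gate(a, b))
```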

The QCA circuit design process involves two major stages: developing the logical level (schematic) and the physical level (physical layout). At the logical level, the circuit design is composed of structural and behavioural descriptions. The structural description comes first, describing and generating a netlist of the circuit. The behavioural description follows, describing the design as a set of input–output relations. Finally, simulation is performed to validate the circuit synthesis. The physical level represents the layout of the QCA circuit, i.e., the pins’ locations, gate placement, and routing.

Logical reversibility refers to a netlist with an equal number of input and output pins [6]. However, this does not guarantee that the internal QCA logic gates, which are composed of majority gates, are themselves reversible with equal numbers of input and output pins. In contrast, physical reversibility refers to reversibility being sustained in the physical layout, i.e., the physical representation of the logical level design. Physical reversibility guarantees that every logic gate building block of the circuit has an equal number of input and output pins. Consequently, there is no information loss and no associated energy loss to the environment. Reversible computing is an effective low-power technique only if reversibility is sustained down to the physical layout level [8].

Recently, numerous studies on reversible QCA flip-flop designs have been conducted [14,15,16,17,18,19]. However, these studies have addressed reversibility only at the logical level and have not treated information loss at the physical layout level. These studies used either the well-known logically reversible gates, such as the 3 × 3 Fredkin gate or the 2 × 2 Feynman gate, or newly suggested logically reversible gates to design their reversible flip-flops. However, the equal number of input and output pins in these circuits’ netlist is insufficient to make the circuit physically reversible and achieve energy consumption lower than the Landauer limit. This is because the internal majority gates that make up these digital logic circuits are not reversible, i.e., the number of input and output pins for each internal majority gate is not equal.
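As a rough illustration of logical reversibility, the sketch below checks whether a gate's truth table is one-to-one. The Feynman (controlled-NOT) and Fredkin (controlled-swap) gates pass this test, whereas the AND gate and the standard three-input majority gate do not. The helper names are hypothetical, and the check covers the logical level only; it cannot establish physical reversibility of a layout.

```python
from itertools import product


def is_logically_reversible(gate, n_inputs: int) -> bool:
    """True if distinct input tuples always give distinct output tuples."""
    outputs = [gate(*bits) for bits in product((0, 1), repeat=n_inputs)]
    return len(set(outputs)) == len(outputs)


def and_gate(a, b):
    return (a & b,)


def majority(a, b, c):
    return ((a & b) | (a & c) | (b & c),)


def feynman(a, b):
    """2 x 2 Feynman gate: passes a through and outputs a XOR b."""
    return (a, a ^ b)


def fredkin(c, a, b):
    """3 x 3 Fredkin gate: swaps a and b when the control c is 1."""
    return (c, b, a) if c else (c, a, b)


if __name__ == "__main__":
    for name, gate, n in (("AND", and_gate, 2), ("majority", majority, 3),
                          ("Feynman", feynman, 2), ("Fredkin", fredkin, 3)):
        print(f"{name:9s} reversible: {is_logically_reversible(gate, n)}")
```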

In 2019, Torres et al. [20] designed and simulated a logically and physically reversible QCA half adder circuit for the first time using QCADesigner-E software—a QCA circuit implementation and simulation platform pioneered by Torres et al. [21]. QCADesigner-E calculated the energy dissipation values of the QCA half adder circuit and confirmed that the logically and physically reversible combinational QCA circuits could be operated with near zero energy dissipation, i.e., values lower than kBTln2 per operation. However, the possibilities of implementing a logically and physically reversible design technique to design sequential QCA circuits, characterized by feedback loops, need to be further examined.

Optimizing computing circuits to make them more energy-efficient would be ideal if realized with minimal area cost and time delay. However, there are trade-offs that are typically made when designing integrated circuits regarding area, speed, and power consumption, i.e., optimizing one may detrimentally affect the others. This trade-off concept has been widely documented and used in VLSI integrated circuit design to meet diverse system requirements [22, 23]. The trade-off in terms of cell count, latency, energy dissipation, the number of layers used, fault tolerance, and reliability for several irreversible digital designs to examine the efficiencies of QCA circuits has been well documented [24, 25]. However, the trade-offs between reversible and irreversible QCA designs have yet to be documented.

Torres et al. [20] implemented a logically and physically reversible technique for the QCA combinational half adder circuit. This technique produced results that demonstrated near zero energy computing and inspired the QCA flip-flop circuits in this study. Here, we present the first implementation of the logically and physically reversible QCA design approach to develop reversible sequential QCA circuits. Using the USE clocking scheme, we propose innovative designs for reversible sequential QCA flip-flop circuits. Additionally, new irreversible designs for sequential QCA flip-flop circuits that implement the USE clocking scheme are introduced, totalling eight novel designs. The new designs are reversible set/reset (SR), data (D), Jack Kilby (JK), and toggle (T) flip-flop, as well as irreversible SR, D, JK, and T flip-flop circuits. QCADesigner-E is employed to evaluate the area, delay time, and energy dissipation trade-offs among the proposed reversible and irreversible sequential QCA flip-flop designs.

The delay, number of logic gates, and number of crossovers are essential metrics for measuring the performance, complexity, power dissipation, and fabrication difficulties of QCA circuits. The delay measures the speed of QCA circuits; the number of employed majority gates is associated with irreversible power dissipation, whereas the number of crossovers correlates with fabrication complexity. To compare and evaluate QCA circuits, a cost function may be derived as a figure-of-merit (FOM) [26, 27]. Two cost functions have been considered in the present work: the first evaluates designs based on three-input majority gates [26], and the second evaluates designs based on five-input majority gates [28].

The circuits proposed in this study employ three-input majority gates; thus, we chose the first cost function [26], which is given by Eq. 1.

$$ {\text{Cost}}_{{{\text{QCA}}}} = \left( {M^{k} + I + C^{l} } \right) \times T^{p} ,\quad k,\;l,\;p \ge 1 $$
(1)

where M is the number of three-input majority gates, I is the number of inverters, C is the number of crossovers, T is the delay time of the circuit in terms of numbers of clock phases, and k, l, p are the exponential weightings for majority gate count, crossover count and delay time, respectively. The number of inverters is always given a weight of 1, as their presence has a limited impact on the complexity of QCA circuits.

Depending on the values of the weightings k, l, and p, the cost function prioritizes different metrics. For example, if speed is a primary concern, more weight can be given to the delay metric, i.e., a higher value of p. If fabrication cost is more important, the value of l should be higher than those of p and k, and so on. Therefore, the weight values can be adjusted depending on the overall design optimization goal [26]. In the most general case, however, a double weighting is given to the number of majority gates M (i.e., k = 2) and the number of crossovers C (i.e., l = 2), with p = 1 [26], so that the following cost function can be applied:

$$ {\text{Cost}}_{{{\text{QCA}}}} = \left( {M^{2} + I + C^{2} } \right) \times T $$
(2)
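The cost function of Eqs. 1 and 2 can be sketched as a small helper. The function name is hypothetical, and the gate counts in the usage example are made-up values for illustration only; they are not results for any of the proposed circuits.

```python
def qca_cost(m: int, i: int, c: int, t: int,
             k: int = 2, l: int = 2, p: int = 1) -> int:
    """Cost = (M^k + I + C^l) * T^p, following Eq. 1.

    m: number of three-input majority gates
    i: number of inverters (always weighted by 1)
    c: number of crossovers
    t: delay in clock phases
    The defaults k = 2, l = 2, p = 1 reproduce Eq. 2.
    """
    if min(k, l, p) < 1:
        raise ValueError("weights k, l, p must all be >= 1")
    return (m ** k + i + c ** l) * t ** p


if __name__ == "__main__":
    # Hypothetical counts: 3 majority gates, 2 inverters, 1 crossover, 7 phases.
    print(qca_cost(3, 2, 1, 7))        # general-case weighting of Eq. 2
    print(qca_cost(3, 2, 1, 7, p=2))   # heavier penalty on delay
```

Raising p, as in the second call, penalizes slow designs more strongly, which matches the speed-oriented weighting described above.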

In the development of circuits utilizing QCA technology, the majority gate plays a vital role. Using majority gates, which are often accompanied by having some fixed inputs, QCA designers can build a variety of QCA-based logic gates. However, increasing the number of fixed value inputs may result in various restrictions during physical design, such as the location and routing of sophisticated circuits, in addition to manufacturing complexity [29]. The fixed input in a QCA circuit may consist of a single kind of fixed value (0 or 1), two types of fixed values (0 and 1) with equal amounts, or two types of fixed values (0 and 1) with differing quantities [29]. This study combines two types of fixed values (0 and 1), with varying amounts, to construct sequential QCA flip-flop circuits with extremely limited fixed inputs.

The remainder of this paper is organized as follows: In Sect. 2, QCA clocking methodologies are reviewed. In Sect. 3, the QCA design and simulation setup are described. Then, the simulation results are described in Sect. 4, and the conclusions of the study are stated in Sect. 5.

2 Reviewing QCA clocking algorithms

Throughout the logic circuit execution process, clocking control is crucial for synchronizing the data flow accurately and guaranteeing correct operation. In QCA, clocking control is achieved by applying an external clock that alters the interdot tunnelling barrier within the QCA cells. Several clocking and timing techniques for regulating data propagation through QCA circuits have been developed.

In 1997, Lent and Tougaw developed an adiabatic switching paradigm that allows time control, solves metastability issues, and facilitates pipelined construction [9]. This methodology involves four 90-degree-shifted clocks to allow the transmission of information across cells [30]. The four distinct stages of the QCA clock are switch, hold, release, and relax, as illustrated in Fig. 3. With this approach, the interdot barriers are simultaneously modulated, over the whole array, as the input states are gradually switched. By switching the array in this fashion, the system is always in the instantaneous ground state. The interdot barriers of the cells are decreased in the initial stage of adiabatic switching, progressively erasing the previous polarizations, set by the previous input. Cells either show minimal or no polarization at the end of this phase. The cells’ interdot barriers are raised during the next stage when the new input is used. The cells repolarize into clearly defined bistable states, as a result of the elevated interdot barriers, eventually reaching the ground state, corresponding to the incoming inputs. This method has the benefits of multiphase clocking and pipelining and enables the QCA array to be partitioned into groups of cells, called clock zones. The clock zone system allows a cluster of QCA cells to perform a particular computation, have its states frozen, and then have its outputs be used as inputs for the adjacent clock zone. Additionally, dividing the QCA wire into zones is akin to trimming it into different small wires, since allowing the QCA wire length to grow can increase the likelihood that cells will not switch correctly, due to thermodynamic constraints [30]. This QCA clocking can be used to synchronize the information, avoiding having a signal reach a logic gate and propagate, before other inputs reach the gate. 
Clocking also suppresses the duplex (bidirectional) nature of QCA signal propagation, ensuring that a signal does not travel back towards the input as it propagates along the wires and across the logic gates. These characteristics are extremely important for QCA circuits, guaranteeing their correct operation. However, this one-dimensional adiabatic switching paradigm has many limitations, such as substantial differences in wire lengths, clock zones with nonuniform widths, large differences in the number of cells between zones (which prevent the implementation of feedback paths), and unused area [31].

Fig. 3 QCA clocking phases in different clock zones
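The four 90-degree-shifted clocks can be modelled as a simple phase table. The sketch below assumes zones numbered 1 to 4 and time counted in quarter-cycle steps, with each zone lagging the previous one by one step; the function name and this indexing are illustrative assumptions.

```python
PHASES = ("switch", "hold", "release", "relax")


def zone_phase(zone: int, step: int) -> str:
    """Phase of clock zone `zone` (1..4) at quarter-cycle time `step`.

    Each zone lags the previous one by a quarter cycle (90 degrees).
    """
    if zone not in (1, 2, 3, 4):
        raise ValueError("zone must be 1, 2, 3 or 4")
    return PHASES[(step - (zone - 1)) % 4]


if __name__ == "__main__":
    for step in range(4):
        row = ", ".join(f"zone {z}: {zone_phase(z, step):7s}" for z in range(1, 5))
        print(f"step {step} | {row}")
```

Under this model, whenever a zone is in the switch phase, the zone that feeds it is in the hold phase, so its frozen outputs serve as stable inputs.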

A two-dimensional QCA clocking method was proposed by Vankamamidi et al. in 2007 [32]. This clocking approach considers zone size and consists of a grid of equal-sized square zones, thereby preventing thermodynamic effects on QCA circuits. However, feedback channels continue to pose a challenge because of their high overhead [31]. In addition, extended cell lines between clocking zones in complex circuits have a detrimental effect on QCA circuits, resulting in increased latency, sluggish timing, and thermal fluctuation sensitivity [31].

In 2015, Campos et al. [31] developed the universal, standard, and efficient (USE) clocking scheme. The USE clocking scheme can meet cell specifications, standardize designs, develop placement and routing algorithms, and exploit other resources that could promote the advancement of QCA technology. Figure 4 demonstrates the USE clocking scheme, which features four clock zones labelled from 1 to 4. A completed clock cycle is realized within these four different clock zones. Each square expresses a clock zone and contains a group of 5 × 5 QCA cell places, while the arrows indicate the data flow between the QCA cells in adjacent clock zones. The USE clocking scheme enables feedback paths with small or large loops and allows for routing simplification due to its flexibility. Feedback loops are essential for sequential circuits, such as flip-flops and memory components.

Fig. 4 The USE clocking scheme (squares express the clock zones, and arrows show data flow)

In digital circuits, clock synchronization is necessary to balance the data propagation speed and guarantee that the data arrival time is correct for each logic gate in the circuit [33]. In QCA circuits, one must distinguish between local and global synchronization when evaluating synchronicity. Local synchronization implies that data can only transfer between cells in consecutively numbered clock zones. The fundamental characteristic of global synchronization is that new data can be applied to the primary inputs of the circuit in each clock cycle, which means that the inputs of every gate must be synchronous for at least one clock cycle before new inputs arrive. A consensus is emerging that local synchronization is a crucial constraint in QCA circuit design [33,34,35,36]. However, the literature is divided on global synchronization. Although many studies emphasize its importance [34, 35], some research claims that global synchronization is not a mandatory constraint in QCA circuits [36].
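The local synchronization constraint lends itself to a simple check. The sketch below assumes a signal path is described by the clock zone numbers (1 to 4) of the successive tiles it crosses, with zone 4 feeding back into zone 1; this representation and the function name are hypothetical.

```python
def is_locally_synchronized(zone_path: list[int]) -> bool:
    """True if every hop moves from clock zone n to the next zone.

    Zones are numbered 1..4 and zone 4 is followed by zone 1.
    A path with fewer than two tiles is trivially synchronized.
    """
    return all(nxt == cur % 4 + 1
               for cur, nxt in zip(zone_path, zone_path[1:]))


if __name__ == "__main__":
    print(is_locally_synchronized([1, 2, 3, 4, 1]))  # consecutive zones
    print(is_locally_synchronized([1, 3, 4]))        # skips zone 2
```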

Real clocking is a crucial feature of developing QCA circuits, since it may significantly lower production costs and simplify the physical architecture of QCA circuits. The real clocking concept was incorporated into the QCA clocking techniques in either pipeline format [32] or dynamic format [31, 36]. Furthermore, a real clocking approach with efficient clustering and placement was recently developed for complicated circuits that are based on five-input majority gates [37]. In general, the real clocking clustering method works well for QCA circuits that use majority gates with more than three inputs [37].

In this study, all proposed designs employed the USE clocking scheme pioneered by Campos et al. [31]. Because of its flexibility, the tile-based USE clocking methodology enables the creation of feedback paths with small or large loops, simplifies the routing process, and creates clock zones with uniform, regular, and bounded forms. Furthermore, current integrated circuit design and fabrication technologies can be utilized to realize USE clocking circuitry. In complex QCA digital circuit designs, clock synchronization, locally and globally, is essential to ensure the balance of the data propagation speed and guarantee that the data arrival time is correct, for the next stage in the circuit [38]. The absence of clock synchronization constraints can lead to the generation of inaccurate bits in the next stage, resulting in incorrect data transmission. In each of our proposed circuits, data are transferred between cells in consecutively numbered clock zones. Furthermore, the input data for each logic gate arrive within four clock phases, i.e., during the same clock cycle. This ensures that all of our designs have local and global synchronization and should produce correct computations.

3 Design and simulation setup

Flip-flops are sequential digital logic circuits in which the output relies on the instantaneous input and the feedback outcome [34]. The feedback characteristic allows flip-flop circuits to store binary data. Thus, flip-flop circuits are essential components in many digital electronic systems, such as computing and memory systems.

In this study, we develop the first ultra-energy-efficient sequential flip-flop circuits that are logically and physically reversible by employing QCA technology. These designs are for the most common flip-flop circuits, including the SR, D, T, and JK flip-flops. In addition, we develop new equivalent irreversible QCA designs of the same flip-flop circuits for comparison. Similar design rules are used for designing these reversible and irreversible circuits to guarantee the reliability of the comparison. The designs are implemented according to the tile-based design concept proposed by Huang et al. [39]. Numerous studies have implemented this concept to facilitate the automated integration of QCA design blocks [40, 41]. The USE clocking scheme with the tile-based design concept is applied to control the timing of the circuits. The wire crossing issue that causes crosstalk interference between two crossing wires is considered one of the main problems in digital QCA circuit design. This issue is resolved by utilizing the multilayer approach studied by Bajec and Pečar [42]. Two different layers are used at the crossed points. Although reversible sequential QCA flip-flop circuits consume less energy than the Landauer limit allows, irreversible QCA flip-flop circuits sometimes require less area and time. The trade-offs among the energy dissipation, area cost, and delay time values between reversible and irreversible sequential QCA USE flip-flop circuits are studied comprehensively in this paper.

Moreover, we implement reversible and irreversible QCA flip-flop circuits utilizing the majority gate as a representative variable. To develop irreversible QCA flip-flop circuits, we used standard majority gates that rely on three binary inputs and one binary output. The standard majority gate is an irreversible logic operation that causes information loss and yields energy dissipation above the Landauer energy limit. However, this operation can become reversible if the number of input and output pins is equal [20, 43]. Thus, to fully develop physically and logically reversible QCA flip-flop circuits, reversible majority gates are used; these gates possess the same binary inputs and outputs by generating copies of the input data. Figure 5 shows the contrast between the standard and reversible QCA majority gates that utilize the USE tile-based clocking scheme. The standard irreversible QCA USE majority gate is depicted in Fig. 5a, while Fig. 5b illustrates the reversible QCA USE majority gate.

Fig. 5 a Standard irreversible QCA USE majority gate, b reversible QCA USE majority gate

To simulate the sequential QCA flip-flop designs, the QCADesigner-E simulation tool, which implements the coherence vector simulation engine (CVSE) [21], is used. The time interval for each iteration (Tstep) is 0.1τ = 0.1 fs, where τ is the relaxation time. A sufficiently small time step is crucial for decreasing the simulation error and obtaining accurate results. This time step keeps the simulation error, expressed as the numerical energy conservation violation, at an acceptable level of ϵenv ≤ 5%. All the technology and simulation parameters utilized in this study are listed in Table 1.

Table 1 Technology and simulation parameters used

The QCA USE SR flip-flop circuit is a single-bit data storage device that can store one binary digit (either 0 or 1) and operates in an active-high mode: the output is “SET” when S = 1 and R = 0 and “RESET” when R = 1 and S = 0. The QCA USE SR flip-flop circuit is designed in two configurations, reversible and irreversible, both based on the Boolean expression shown in Eq. 3. Table 2 presents the irreversible QCA SR flip-flop truth table, and Table 3 presents the truth table of the reversible QCA SR flip-flop circuit.

$$ Q_{(t)} = M(S,\;\overline{R},\;Q_{(t - 1)} ) = S.\overline{R} + S.Q_{(t - 1)} + \overline{R}.Q_{(t - 1)} $$
(3)
Table 2 Irreversible SR flip-flop truth table
Table 3 Reversible SR flip-flop truth table
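Equation 3 can be checked exhaustively at the Boolean level. The sketch below is a purely logical model with hypothetical function names; it ignores clock zones, feedback delay, and the copy and garbage outputs of the reversible layout.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    """Three-input majority vote."""
    return (a & b) | (a & c) | (b & c)


def sr_next(s: int, r: int, q_prev: int) -> int:
    """Next state of the SR flip-flop, Q(t) = M(S, NOT R, Q(t-1)) as in Eq. 3."""
    return majority(s, 1 - r, q_prev)


if __name__ == "__main__":
    print(" S R Q(t-1) | Q(t)")
    for s, r, q in product((0, 1), repeat=3):
        print(f" {s} {r}   {q}    |  {sr_next(s, r, q)}")
```

In this model, S = R = 0 leaves the stored bit unchanged, and so does S = R = 1, because the first two majority inputs then disagree and the feedback input decides the vote.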

The proposed irreversible QCA USE SR flip-flop utilizes only one majority gate. The synthesis of this design is shown in Fig. 6a, and Fig. 6b shows the circuit layout with the implementation of the USE clocking scheme. This QCA USE SR flip-flop has two inputs: S represents the “SET” position, and R represents the “RESET” position. The stored bit is present on the output marked \(\mathrm{Q}\), and \(\overline{\mathrm{Q}}\) is the complement of \(\mathrm{Q}\). The delay time of this circuit is three clock zones (0.75 clock cycles), while the required area cost is only five tiles.

Fig. 6 Proposed irreversible QCA USE SR flip-flop circuit a schematic b layout

The proposed reversible QCA USE SR flip-flop circuit also uses only one majority gate to store a single bit of data. Figure 7a shows the design schematic, and Fig. 7b depicts the circuit layout implemented in the USE clocking scheme. Note that, in the reversible QCA flip-flop circuit design diagrams, the "cp" labels of the outputs refer to copies of the input information, and "g1" and "g2" indicate the so-called garbage outputs. The reversible QCA USE SR flip-flop matches the irreversible one in terms of area cost and latency: the circuit delay is three clock zones (0.75 clock cycles), and the required area cost is five tiles.

Fig. 7 Proposed logically and physically reversible QCA USE SR flip-flop circuit a schematic b layout (Scp and Rcp indicate copies of the inputs)

The QCA USE D flip-flop, where D refers to the data input, is controlled by a clock signal CLK. By combining the two input signals (S and R) of the SR flip-flop and adding an inverter, we can SET and RESET the flip-flop using just one input, as the two input signals are complements of each other. This complementarity avoids the forbidden state of the SR flip-flop, because S and R can never be equal to 1 simultaneously. Both the proposed reversible and irreversible QCA USE D flip-flops consist of three majority gates to store a single bit of data. The Boolean expression for these QCA USE D flip-flops is given in Eq. 4, where D is the input data, CLK is the clock data, and Q is the output holding the stored bit; the truth tables of the irreversible and reversible QCA USE D flip-flop circuits are displayed in Tables 4 and 5, respectively.

$$ \begin{aligned} & M_{1} = M\left( {D,\;{\text{CLK}},\;0} \right) \\ & M_{2} = M(\overline{D},\;{\text{CLK}},\;0) \\ & M_{3} = M(M_{1} ,\overline{M}_{2} ,Q_{(t - 1)} ) \\ & Q_{\left( t \right)} = M \left( {M \left( {D,{\text{CLK}},0} \right),\overline{{M \left( {\overline{D},{\text{CLK}},0} \right)}} ,Q_{{\left( {t - 1} \right)}} } \right) = D.{\text{CLK}} + \overline{{{\text{CLK}}}} .Q_{{\left( {t - 1} \right)}} \\ \end{aligned} $$
(4)
Table 4 Irreversible D flip-flop truth table
Table 5 Reversible D flip-flop truth table
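The three-gate network of Eq. 4 can be compared against its closed form for all eight input combinations. This is a Boolean-level sketch with hypothetical function names and does not model the layout or the clocking.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    """Three-input majority vote."""
    return (a & b) | (a & c) | (b & c)


def d_next(d: int, clk: int, q_prev: int) -> int:
    """Next state of the D flip-flop built from the three majority gates of Eq. 4."""
    m1 = majority(d, clk, 0)          # D AND CLK
    m2 = majority(1 - d, clk, 0)      # (NOT D) AND CLK
    return majority(m1, 1 - m2, q_prev)


def d_closed_form(d: int, clk: int, q_prev: int) -> int:
    """Closed form of Eq. 4: D.CLK + (NOT CLK).Q(t-1)."""
    return (d & clk) | ((1 - clk) & q_prev)


if __name__ == "__main__":
    for d, clk, q in product((0, 1), repeat=3):
        print(d, clk, q, "->", d_next(d, clk, q))
```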

The proposed irreversible QCA USE D flip-flop circuit uses three majority gates for storing a binary digit. Accordingly, the output is “SET” when CLK = 1 and D = 1, “RESET” when CLK = 1 and D = 0, and “hold data” (no change) when CLK = 0. Figure 8a demonstrates the synthesis of the proposed irreversible QCA USE D flip-flop design, and Fig. 8b shows the circuit layout. The circuit delay is seven clock zones (1.75 clock cycles). Furthermore, this irreversible QCA USE D flip-flop circuit requires an area of 13 tiles.

Fig. 8 Proposed irreversible QCA USE D flip-flop circuit a schematic b layout

Figure 9a demonstrates the synthesis of the proposed reversible QCA USE D flip-flop design, and Fig. 9b shows the circuit layout. The circuit delay is eight clock zones (2 clock cycles). Furthermore, this reversible QCA USE D flip-flop requires an area of 14 tiles.

Fig. 9 Proposed reversible QCA USE D flip-flop circuit a schematic b layout (Dcp and CLKcp indicate copies of the inputs, whereas g1 and g2 are so-called garbage outputs)

The QCA USE JK flip-flop behaves similarly to the SR flip-flop, where J is equivalent to the S input, and K represents the R input. An additional merit is that the JK flip-flop is supplemented by a control clock signal and produces a toggled output value when CLK = 1, J = 1, and K = 1. Accordingly, the output is "SET" when CLK = 1, J = 1, and K = 0, "RESET" when CLK = 1, J = 0, and K = 1, and "hold data" (no change) when CLK = 0, or when J = 0 and K = 0, even if CLK = 1. Both the proposed irreversible and reversible QCA USE JK flip-flop circuits employ five majority gates to store a single bit of data. The Boolean expression for the QCA USE JK flip-flop designs is given in Eq. 5, where J and K are the input data, CLK is the clock data, and Q is the stored bit. Tables 6 and 7 present the truth tables of the irreversible and reversible circuits, respectively.

$$ \begin{aligned} & M_{1} = M\left( {J,0,{\text{CLK}}} \right) \\ & M_{2} = M\left( {K,0,{\text{CLK}}} \right) \\ & M_{3} = M(M_{1} ,0,\overline{Q}_{(t - 1)} ) \\ & M_{4} = M(\overline{M}_{2} ,0,Q_{(t - 1)} ) \\ & M_{5} = M \left( {M_{3} ,1,M_{4} } \right) \\ & Q\left( t \right) = M \left( {M \left( {M \left( {J,0,{\text{CLK}}} \right),0,\overline{Q}_{{\left( {t - 1} \right)}} } \right), 1,M \left( {\overline{{M \left( {K,0,{\text{CLK}}} \right)}} ,0,Q_{{\left( {t - 1} \right)}} } \right)} \right) = J.{\text{CLK}}.\overline{Q}_{{\left( {t - 1} \right)}} + (\overline{{K.{\text{CLK}}}} ).Q_{{\left( {t - 1} \right)}} \\ \end{aligned} $$
(5)
Table 6 Irreversible JK flip-flop truth table
Table 7 Reversible JK flip-flop truth table
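The five-gate network of Eq. 5 and the SET, RESET, hold, and toggle behaviour described above can be verified with a short Boolean model. The function names are hypothetical, and timing and layout are not modelled.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    """Three-input majority vote."""
    return (a & b) | (a & c) | (b & c)


def jk_next(j: int, k: int, clk: int, q_prev: int) -> int:
    """Next state of the JK flip-flop built from the five majority gates of Eq. 5."""
    m1 = majority(j, 0, clk)              # J AND CLK
    m2 = majority(k, 0, clk)              # K AND CLK
    m3 = majority(m1, 0, 1 - q_prev)      # J.CLK.(NOT Q)
    m4 = majority(1 - m2, 0, q_prev)      # NOT(K.CLK).Q
    return majority(m3, 1, m4)            # OR of the two terms


def jk_closed_form(j: int, k: int, clk: int, q_prev: int) -> int:
    """Closed form of Eq. 5: J.CLK.(NOT Q) + NOT(K.CLK).Q."""
    return (j & clk & (1 - q_prev)) | ((1 - (k & clk)) & q_prev)


if __name__ == "__main__":
    for j, k, clk, q in product((0, 1), repeat=4):
        print(j, k, clk, q, "->", jk_next(j, k, clk, q))
```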

Figure 10a demonstrates the synthesis of the developed irreversible QCA USE JK flip-flop design, and Fig. 10b shows the circuit layout. The circuit delay is five clock zones (1.25 clock cycles). Furthermore, this irreversible QCA USE JK flip-flop requires an area of 9 tiles.

Fig. 10 Proposed irreversible QCA USE JK flip-flop circuit a schematic b layout

Figure 11a demonstrates the synthesis of the proposed reversible QCA USE JK flip-flop design, and Fig. 11b shows the circuit layout. The circuit delay is seven clock zones (1.75 clock cycles). Furthermore, this reversible QCA USE JK flip-flop requires an area of 13 tiles.

Fig. 11 Proposed reversible QCA USE JK flip-flop circuit a schematic b layout (Jcp, Kcp and CLKcp indicate copies of the inputs, whereas g1, g2, g3 and g4 are so-called garbage outputs)

Finally, the QCA USE T flip-flop circuits are constructed based on the JK flip-flop design. The T flip-flop has only one input connected to the two inputs of the JK flip-flop. This single input T represents the toggle, suggesting that this flip-flop can work as a toggle switch. The T flip-flop output is toggled when CLK = 1 and T = 1; otherwise, it keeps the previous output. Both the proposed reversible and irreversible QCA USE T flip-flop circuits employ four majority gates to store a single bit of data. The Boolean expression for the QCA USE T flip-flop designs is given in Eq. 6, where T is the input data, CLK is the clock data, and Q is the stored bit. Table 8 presents the truth table of the irreversible circuit, while Table 9 presents the truth table of the reversible circuit.

$$ \begin{aligned} & M_{1} = M \left( {T,0,{\text{CLK}}} \right) \\ & M_{2} = M (M_{1} ,0,\overline{Q}_{{\left( {t - 1} \right)}} ) \\ & M_{3} = M (\overline{M}_{1} ,0,Q_{{\left( {t - 1} \right)}} ) \\ & M_{4} = M \left( {M_{2} ,1,M_{3} } \right) \\ & Q_{\left( t \right)} = M \left( {M \left( {M \left( {T,0,{\text{CLK}}} \right),0,\overline{Q}_{{\left( {t - 1} \right)}} } \right), 1,M \left( {\overline{{M \left( {T,0,{\text{CLK}}} \right)}} ,0, Q_{{\left( {t - 1} \right)}} } \right)} \right) = T.{\text{CLK}}. \overline{Q}_{{\left( {t - 1} \right)}} + \overline{T}. Q_{{\left( {t - 1} \right)}} + \overline{{{\text{CLK}}}} . Q_{{\left( {t - 1} \right)}} \\ \end{aligned} $$
(6)
Table 8 Irreversible T flip-flop truth table
Table 9 Reversible T flip-flop truth table
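
As an illustration of Eq. 6, the following minimal Python sketch evaluates the four-majority-gate construction for every input combination and checks it against the sum-of-products form. The function names are hypothetical and chosen only for this example; the sketch models the logical output Q(t) alone and does not model the copies or garbage outputs of the reversible circuit, nor the clocked QCA layout.

```python
from itertools import product


def majority(a: int, b: int, c: int) -> int:
    """Three-input majority vote: returns 1 when at least two inputs are 1."""
    return 1 if (a + b + c) >= 2 else 0


def t_flip_flop_next(t: int, clk: int, q_prev: int) -> int:
    """Next state Q(t) built from the four majority gates M1..M4 of Eq. 6."""
    m1 = majority(t, 0, clk)            # M1 = T AND CLK
    m2 = majority(m1, 0, 1 - q_prev)    # M2 = M1 AND NOT Q(t-1)
    m3 = majority(1 - m1, 0, q_prev)    # M3 = NOT M1 AND Q(t-1)
    return majority(m2, 1, m3)          # M4 = M2 OR M3


def t_flip_flop_sop(t: int, clk: int, q_prev: int) -> int:
    """Sum-of-products form on the right-hand side of Eq. 6."""
    return (t & clk & (1 - q_prev)) | ((1 - t) & q_prev) | ((1 - clk) & q_prev)


def truth_table() -> list[tuple[int, int, int, int]]:
    """Rows of (T, CLK, Q(t-1), Q(t)) for all eight input combinations."""
    return [
        (t, clk, q, t_flip_flop_next(t, clk, q))
        for t, clk, q in product((0, 1), repeat=3)
    ]


print("T CLK Q(t-1) Q(t)")
for row in truth_table():
    print(*row)
```

A fixed input of 0 turns the majority gate into an AND gate and a fixed input of 1 turns it into an OR gate, which is why the nested majority expression reduces to the sum-of-products form.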

The synthesis of the irreversible QCA USE T flip-flop design is shown in Fig. 12a, and Fig. 12b shows the circuit layout with the implementation of the USE clocking scheme.

Fig. 12 Proposed irreversible QCA USE T flip-flop circuit: (a) schematic, (b) layout

Figure 13a shows the reversible QCA USE T flip-flop schematic, and Fig. 13b depicts the circuit layout, which incurs no cost penalty compared to the irreversible circuit. The latencies and utilized areas are similar for the proposed irreversible and reversible QCA USE T flip-flop designs, each with a delay of six clock zones (1.5 clock cycles) and an area of ten tiles.

Fig. 13 Proposed reversible QCA USE T flip-flop circuit: (a) schematic, (b) layout (Tcp, Kcp and Qcp indicate copies of the inputs, whereas g1, g2 and g3 are so-called garbage outputs)

4 Simulation results

This section presents the simulated input/output responses of the eight proposed designs, that is, the reversible and irreversible versions of the four flip-flop types considered. The majority gate count, area cost, delay time, and energy dissipation of each proposed QCA USE flip-flop circuit are also reported. QCADesigner-E software, employing the coherence vector engine, was used to validate the performance of the circuits and evaluate the energy dissipation. The majority gate count, occupied area, and delay time were calculated directly from the layout structures of the circuits.

Figure 14 shows the simulation results for the proposed irreversible QCA USE SR flip-flop, whose design is shown in Fig. 6. Figure 14a shows the input/output values for the case Q(t−1) = 0, while Fig. 14b shows those for the case Q(t−1) = 1. The results confirm that the circuit computation agrees with Table 2. Figure 15a, b shows the input/output values for the proposed reversible QCA USE SR flip-flop circuit for the cases Q(t−1) = 0 and Q(t−1) = 1, respectively; the design of this circuit is shown in Fig. 7. The simulation results verify that the circuit computation agrees with Table 3.

Fig. 14 Simulation waveforms of the proposed irreversible QCA USE SR flip-flop: (a) for Q(t−1) = 0, (b) for Q(t−1) = 1

Fig. 15 Simulation waveforms of the proposed reversible QCA USE SR flip-flop: (a) for Q(t−1) = 0, (b) for Q(t−1) = 1

The simulation results for the second proposed QCA flip-flop, the D flip-flop, are presented in Figs. 16 and 17. Figure 16a, b shows the input/output values for the proposed irreversible QCA USE D flip-flop for the cases Q(t−1) = 0 and Q(t−1) = 1, respectively; the design of this circuit is shown in Fig. 8. The simulation results confirm that the circuit computation is consistent with Table 4. Furthermore, Fig. 17a, b shows the input/output values for the proposed reversible QCA USE D flip-flop for the cases Q(t−1) = 0 and Q(t−1) = 1, respectively; the design of this circuit is shown in Fig. 9. The simulation results confirm that the circuit computation agrees with Table 5.

Fig. 16 Simulation waveforms of the proposed irreversible QCA USE D flip-flop: (a) for Q(t−1) = 0, (b) for Q(t−1) = 1

Fig. 17 Simulation waveforms of the proposed reversible QCA USE D flip-flop: (a) for Q(t−1) = 0, (b) for Q(t−1) = 1

Figures 18 and 19 show the simulation results of the irreversible and reversible QCA JK flip-flop circuits, whose designs are given in Figs. 10 and 11, respectively. Figure 18a, b shows the input/output values for the proposed irreversible QCA USE JK flip-flop for the states Q(t−1) = 0 and Q(t−1) = 1, respectively. The simulation results verify the desired circuit computation and agree with Table 6. Likewise, Fig. 19a, b shows the input/output values for the reversible QCA USE JK flip-flop for the states Q(t−1) = 0 and Q(t−1) = 1, respectively. The simulation results verify the desired circuit computation and agree with Table 7.

Fig. 18 Simulation waveforms of the proposed irreversible QCA USE JK flip-flop: (a) for Q(t−1) = 0, (b) for Q(t−1) = 1

Fig. 19 Simulation waveforms of the proposed reversible QCA USE JK flip-flop: (a) for Q(t−1) = 0, (b) for Q(t−1) = 1

Finally, Figs. 20 and 21 show the simulation results of the irreversible and reversible QCA T flip-flop circuits, the designs of which are given in Figs. 12 and 13, respectively. Figure 20a, b shows the input/output values for the proposed irreversible QCA USE T flip-flop for the states Q(t−1) = 0 and Q(t−1) = 1, respectively. The simulation results verify the desired circuit computation and are consistent with Table 8. Similarly, Fig. 21a, b shows the input/output values for the reversible QCA USE T flip-flop for the states Q(t−1) = 0 and Q(t−1) = 1, respectively. The simulation results verify the desired circuit computation and agree with Table 9.

Fig. 20 Simulation waveforms of the proposed irreversible QCA USE T flip-flop: (a) for Q(t−1) = 0, (b) for Q(t−1) = 1

Fig. 21 Simulation waveforms of the proposed reversible QCA USE T flip-flop: (a) for Q(t−1) = 0, (b) for Q(t−1) = 1

The simulation waveforms confirm the reliability of the investigated designs: each waveform is consistent with the truth table and the Boolean equation of the corresponding circuit. For every flip-flop type, the irreversible and reversible implementations use the same number of majority gates. The irreversible and reversible QCA USE flip-flop circuits use 1 majority gate for the SR flip-flops, 3 for the D flip-flops, and 5 for the JK and T flip-flops (see Table 11).

The crucial benefit of designing digital circuits in a fully (logically and physically) reversible manner is energy efficiency. Compared with the irreversible circuits, the energy dissipation of the fully reversible circuits improves significantly, falling to values below the Landauer energy limit. For the proposed irreversible QCA USE SR flip-flop, the average dissipated energy over the different binary input combinations is 0.365 meV. This value drops by 97.8% to 0.008 meV in the reversible implementation of the same circuit. Likewise, for the irreversible QCA USE D flip-flop, the average dissipated energy is 1.007 meV, which declines by 95.5% to 0.045 meV in the reversible implementation. For the irreversible QCA USE JK flip-flop, the average dissipated energy is 1.609 meV, which falls by 98.4% to 0.026 meV in the fully reversible implementation. Finally, for the irreversible QCA USE T flip-flop, the average dissipated energy is 1.129 meV, which drops by 96.9% to 0.035 meV in the fully reversible implementation.
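
The percentage reductions quoted above, and the comparison with the Landauer limit, can be reproduced with a short calculation. The sketch below is only an illustration: the variable and function names are hypothetical, the energies are the average values quoted in the text, and the temperature of 1 K is the one stated in the conclusion of this paper.

```python
import math

# Boltzmann constant expressed in meV/K (8.617333262e-5 eV/K).
K_B_MEV_PER_K = 8.617333262e-2

# (irreversible, reversible) average dissipated energy in meV, as quoted in the text.
AVERAGE_ENERGY_MEV = {
    "SR": (0.365, 0.008),
    "D": (1.007, 0.045),
    "JK": (1.609, 0.026),
    "T": (1.129, 0.035),
}


def landauer_limit_mev(temperature_k: float) -> float:
    """Landauer bound k_B * T * ln(2) per erased bit, in meV."""
    return K_B_MEV_PER_K * temperature_k * math.log(2)


def reduction_percent(irreversible: float, reversible: float) -> float:
    """Relative energy saving of the reversible design, in percent."""
    return 100.0 * (irreversible - reversible) / irreversible


limit = landauer_limit_mev(1.0)  # assumed simulation temperature of 1 K
print(f"Landauer limit at 1 K: {limit:.4f} meV")
for name, (irreversible, reversible) in AVERAGE_ENERGY_MEV.items():
    saving = reduction_percent(irreversible, reversible)
    print(f"{name}: {saving:.1f}% lower, below limit: {reversible < limit}")
```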

To our knowledge, no earlier research has examined QCA flip-flop circuits that are fully (logically and physically) reversible. The fully reversible flip-flops proposed in this study ensure that each internal majority gate that makes up the circuit is reversible and has the same number of input and output signals, which makes them highly energy-efficient. To evaluate the energy efficiency of the proposed circuits, we compared them to the latest logically reversible QCA flip-flops described in the literature. Logically reversible flip-flops consist of reversible gates such as the 3 × 3 Fredkin gate and the 2 × 2 Feynman gate, although the internal majority gates that make up these gates are irreversible. As Table 10 shows, the proposed fully reversible USE flip-flops consume much less energy at the 1 Ek tunnelling energy level than the flip-flops in the literature [18, 44,45,46]. Each type of proposed flip-flop uses almost 98% less energy, with values that are close to zero.
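
The notion of logical reversibility used here can be made concrete: a gate is logically reversible when its input-to-output mapping is a bijection, so the inputs can always be recovered from the outputs. The following sketch, with hypothetical helper names, checks this property for the standard Feynman and Fredkin gates and for a plain three-input majority gate. It says nothing about physical reversibility, which depends on how the gates are implemented in the QCA layout.

```python
from itertools import product


def feynman(a: int, b: int) -> tuple[int, int]:
    """2 x 2 Feynman (controlled-NOT) gate: (A, B) -> (A, A XOR B)."""
    return (a, a ^ b)


def fredkin(c: int, a: int, b: int) -> tuple[int, int, int]:
    """3 x 3 Fredkin (controlled-swap) gate: swaps A and B when C = 1."""
    return (c, b, a) if c else (c, a, b)


def majority(a: int, b: int, c: int) -> tuple[int]:
    """Conventional three-input, one-output majority gate."""
    return (1 if (a + b + c) >= 2 else 0,)


def is_logically_reversible(gate, n_inputs: int) -> bool:
    """True when every distinct input pattern yields a distinct output pattern."""
    inputs = list(product((0, 1), repeat=n_inputs))
    outputs = {gate(*bits) for bits in inputs}
    return len(outputs) == len(inputs)


for name, gate, n in (("Feynman", feynman, 2), ("Fredkin", fredkin, 3), ("Majority", majority, 3)):
    print(f"{name}: logically reversible = {is_logically_reversible(gate, n)}")
```

The single-output majority gate maps eight input patterns onto two output values, so information is lost; this is the reason a circuit built from reversible gates can still contain irreversible building blocks internally.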

Table 10 Average and total energy dissipation comparison

Developing QCA sequential flip-flop digital circuits in a physically and logically reversible fashion thus leads to an energy dissipation lower than the Landauer energy limit. Nevertheless, the reduced energy dissipation is sometimes accompanied by a trade-off in area cost and/or latency, as in the case of the D, JK, and T flip-flop circuits.

For the USE SR flip-flop designs, the reversible circuit incurs no additional area cost or latency penalty compared with the irreversible circuit. Both the irreversible and reversible USE SR flip-flops occupy an area of 0.060 µm² and have a delay of 3 clock zones (0.75 clock cycles). In contrast, implementing the USE D, JK, and T flip-flops reversibly leads to higher area costs and/or delay times than the irreversible designs of the same circuits. The irreversible USE D flip-flop occupies an area of 0.1686 µm², with a delay of 7 clock zones (1.75 clock cycles). In the reversible USE D flip-flop design, the area increases by 10.7% to 0.1888 µm² and the latency increases by 14.3% to 8 clock zones (2 clock cycles). Similarly, the irreversible USE JK flip-flop occupies an area of 0.1089 µm² with a delay of 5 clock zones (1.25 clock cycles). In the reversible USE JK flip-flop design, the area increases by 112.9% to 0.2328 µm² and the delay increases by 40% to 7 clock zones (1.75 clock cycles). For the USE T flip-flop, the area increases by 18.9% from 0.1288 µm² in the irreversible design to 0.1589 µm² in the reversible design, whereas the delay is 6 clock zones (1.5 clock cycles) for both designs.
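
The trade-off described above can be tabulated with a few lines of code. The sketch below is a hedged illustration with hypothetical names: it converts clock zones to clock cycles, assuming four clock zones per cycle (consistent with 3 zones = 0.75 cycles in the text), and computes relative increases with the irreversible design as the baseline. Percentages obtained this way from the rounded areas quoted in the text need not coincide exactly with the quoted percentages, which may have been derived from unrounded values or a different baseline.

```python
ZONES_PER_CYCLE = 4  # assumption: four clock zones make up one clock cycle

# (irreversible, reversible) values as quoted in the text.
AREA_UM2 = {
    "SR": (0.060, 0.060),
    "D": (0.1686, 0.1888),
    "JK": (0.1089, 0.2328),
    "T": (0.1288, 0.1589),
}
DELAY_ZONES = {"SR": (3, 3), "D": (7, 8), "JK": (5, 7), "T": (6, 6)}


def zones_to_cycles(zones: int) -> float:
    """Convert a delay in clock zones to clock cycles."""
    return zones / ZONES_PER_CYCLE


def relative_increase_percent(baseline: float, new: float) -> float:
    """Increase of `new` over `baseline`, in percent of the baseline."""
    return 100.0 * (new - baseline) / baseline


for name in AREA_UM2:
    area_irr, area_rev = AREA_UM2[name]
    delay_irr, delay_rev = DELAY_ZONES[name]
    print(
        f"{name}: area +{relative_increase_percent(area_irr, area_rev):.1f}%, "
        f"delay {zones_to_cycles(delay_irr)} -> {zones_to_cycles(delay_rev)} cycles "
        f"(+{relative_increase_percent(delay_irr, delay_rev):.1f}%)"
    )
```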

Furthermore, to evaluate their performance, we compared our proposed flip-flops to the most recently proposed flip-flops in the literature [15, 44, 46,47,48], using the cost function in [26]. The proposed irreversible USE flip-flops achieve cost improvements of 79.55%, 9%, 16.67%, and 20.59% for the SR, D, JK, and T flip-flops, respectively, compared to the previous best designs [44, 46,47,48]. Table 11 presents a detailed comparison of the proposed irreversible and reversible QCA USE flip-flop circuits with the most recent flip-flop circuits in the literature. It lists the number of majority gates and inverters used, the number and type of fixed inputs, the number and type of crossovers, the area cost, the cell count, the delay time, and the cost function.

Table 11 Flip-flop performance evaluation

5 Conclusion

In this work, we confirmed through simulation that developing sequential QCA digital circuits in a logically and physically reversible manner yields computing circuits that dissipate less energy than the Landauer limit (kBTln2). We have designed the first sequential QCA flip-flops that are both logically and physically reversible for the most common flip-flop types: the SR, D, JK, and T flip-flops. These flip-flop circuits were also developed in irreversible form to compare the energy dissipation, occupied area, and latency costs of the reversible and irreversible designs. All designed circuits were constructed around the majority gate. The USE clocking scheme, which supports feedback paths, was employed to achieve timing control. The QCADesigner-E tool, which is based on a microscopic quantum mechanical model of the QCA cell, was used to calculate both the polarization input/output waveform response and the energy dissipated by the sequential QCA designs.

The simulation results confirmed that it is possible to design sequential QCA circuits that are logically and physically reversible. Making the QCA flip-flop circuits logically and physically reversible decreased the energy dissipation by more than 95%. The simulation results demonstrated ultralow energy dissipation levels, below the Landauer limit of 0.06 meV at a temperature of 1 K.

Although designing sequential QCA flip-flop circuits reversibly reduces the energy dissipation significantly, to values below the Landauer energy limit, it sometimes increases the area cost and/or the delay time beyond those of the irreversible sequential QCA flip-flop circuits. The trade-off between the energy dissipation and the area cost and delay time of the reversible and irreversible QCA circuits was also investigated systematically in this study. The results showed that the irreversible QCA USE flip-flop designs had area costs and delay times that were lower than or equal to those of the reversible designs. For the USE SR flip-flop design, the reversible circuit incurs no area or latency penalty compared with the irreversible circuit. However, the area costs and delay times increase when the USE D, JK, and T flip-flops are implemented reversibly. The area costs of the reversible USE D, JK, and T flip-flop circuits are 10.7%, 112.9%, and 18.9% higher, respectively, than those of the irreversible circuits. Additionally, the delay times of the reversible USE D and JK flip-flops are 14.3% and 40% higher, respectively, than those of the irreversible circuits.

The energy consumption of the logically and physically reversible flip-flops proposed here is much lower than that of the flip-flops introduced in the literature. Compared to earlier designs, the energy consumption of each type of flip-flop has decreased by approximately 98%. In addition, the cost of the proposed irreversible SR, D, JK, and T flip-flops is reduced by 91.67%, 9%, 40.79%, and 20.5%, respectively, compared to the previously proposed flip-flops.

The results of this study provide support for the serious consideration of QCA as an alternative to overcome the integration limitations of conventional irreversible CMOS computation technologies. Future work is necessary to explore the logically and physically reversible design technique applied to more sophisticated QCA computing systems consisting of combined combinational and sequential logic circuits.