1 Introduction

A trend observed over the last decades in nanoelectronics is the continuous replacement of analog functions by digital ones, which has contributed to the current predominance of digital circuits in the implementation of practically all functions found in integrated circuits. The reasons for this replacement are various, among them the size reduction of transistors, which allows the development of sophisticated digital circuits; the high level of development of CAD tools supporting digital design, not matched by the tools for analog design; the difficulty of re-using analog circuits; and the smaller benefits that analog circuits obtain from new technology nodes [1, 2].

Despite this trend, some typical functions remain analog: those that generate references, like oscillators and voltage reference sources; those that handle power voltages; those that work on the input/output side of systems, like amplifiers and drivers; and those that interface the analog and digital domains, like analog-to-digital and digital-to-analog converters (ADCs and DACs). ADCs and DACs, more than any other function, can hardly be replaced. Their main performance parameters are resolution, sampling rate, and power consumption, which vary widely from application to application, giving scope to the development of dozens of different converter topologies.

This work presents the design and implementation of a low-power, differential, asynchronous successive approximation register analog-to-digital converter (SAR ADC) in a 65-nm CMOS technology, with a flexible sampling rate ranging from a few kS/s up to 12 MS/s. The SAR architecture, which has few analog blocks (only switches and one comparator) and a simple design, is well suited for moderate-resolution and low-to-medium data rate applications [3]. The designed converter is fit for use in a wide spectrum of low-power systems and subsystems, such as neural signal recorders and the front ends of wireless receivers.

Technologies that enable neurologists and clinicians to observe the simultaneous activity of many neurons are in great demand [4]. Capturing a single neuronal signal requires an ADC with no more than 20 kS/s, at least 10 bits, and minimum power consumption, both to avoid damaging living cells and to extend battery life. In the case of simultaneous neural recording, with multiple electrodes sharing a single ADC, the sampling rate increases by hundreds of times. The proposed ADC can handle from a single neural signal, with power consumption lower than 1.0 μW, up to almost 500 different neural signals.

At the other end of the spectrum, the development of wireless network standards defining the physical layer of low data rate front ends with very long battery life, like IEEE 802.15.4 and Bluetooth Low Energy (BLE), has triggered the evolution of wireless hardware for applications like the internet of things (IoT). IoT-based solutions for a wide range of applications, such as transportation, manufacturing, and consumer electronics, have increased over the past few years due to their potential for cost reduction and business process improvement [5]. Although the demand for software development is predominant in these fields, the hardware must also evolve; otherwise it will become a bottleneck for IoT systems. The designed ADC can also be employed in the front end of IEEE 802.15.4 standard receivers, like Zigbee and Thread, working in bandwidths ranging from 200 kHz up to 5.0 MHz.

This paper gives a more comprehensive account of the SAR ADC first presented in [6]. A few improvements were introduced in the current design, such as circuitry that avoids short circuits between the power sources caused by switches conducting simultaneously, and circuitry that prevents a metastability event from stopping the ADC operation. In the ADC description, more circuits are shown and explained, and the full ADC operation is detailed. Because of the modifications, all the results were reviewed. Additionally, new results on the power consumption of the ADC parts and the evaluation of the Conversion Error Rate (CER) are included.

This paper is organized as follows. Section 2 shows the ADC main blocks and their operation. Section 3 presents the ADC layout and the care taken in its design. Simulation results for different conditions are presented and analyzed in Sect. 4. Finally, in Sect. 5, conclusions are drawn.

2 ADC architecture and design

In this work, an asynchronous implementation of the SAR ADC was employed. Typical synchronous implementations [2] make use of two synchronized clock sources, one to control the data sampling (the sampling clock) and the other to control the converter iterations (the SAR iteration clock). Asynchronous SAR ADCs [7, 8], on the other hand, require only one clock signal to control the sampling. This feature relaxes the clock distribution requirements. Additionally, the designed ADC has a differential architecture to minimize the effects of common-mode noise and to reduce even-order harmonic distortion [2]. A block diagram of the differential asynchronous SAR ADC is presented in Fig. 1.

Fig. 1 The asynchronous SAR ADC block diagram

The ADC is composed of a pair of input switches, implemented using the bootstrapping technique to improve linearity [9], a capacitive DAC, a dynamic comparator, asynchronous logic, and output registers. An external sampling clock (sample_clk) controls the bootstrapped switches and the asynchronous logic circuit that sets the ADC to either the sampling or the conversion phase. When the sampling clock is set to low (sampling phase), the bootstrapped switches turn on and the capacitor array of the DAC is charged according to the differential input signal. In this phase, the power consumption of the ADC is very low because there is no digital circuit activity and the analog circuits draw little current.

When the sampling clock is set to high (conversion phase), the bootstrapped switches turn off and the SAR procedure is triggered, starting the binary search process, which takes N iterations, where N is the resolution of the ADC. In the first iteration, the vdac_p and vdac_n signals remain constant. From the second iteration on, vdac_p and vdac_n have their values modified by (N − 1) additions/subtractions of VFS/2^i, where VFS is the single-ended full-scale voltage (VDD) and i is the iteration number, ranging from 2 up to N.

The timing diagram of the asynchronous SAR ADC is shown in Fig. 2 [10]. The diagram indicates the sample_clk, the internal variable-frequency clock that controls the search process iterations (clkc), and the time intervals in which the signal sampling (ttrack), the comparator evaluation or regeneration (treg), and the comparator reset and DAC charge redistribution (tfix) are performed. Figure 3 shows in detail the behavior of sample_clk, clkc, vdac_p, vdac_n, and the end of conversion flag (eoc) during the evaluation of the three least significant bits.

Fig. 2 Timing diagram of an asynchronous SAR ADC

Fig. 3 Internal signal waveforms during the SAR procedure for the three least significant bits

For each one of the iterations presented in Fig. 3, the DAC charge redistribution and the comparator reset are first triggered. After a fixed time interval tfix, the clkc signal falls, indicating that a new differential input is provided for the comparator and that the comparator itself is ready. Next, a comparison (regeneration) is performed in a variable interval treg and, when the result is ready, a signal is sent to the asynchronous logic circuit, causing the rising of clkc. At the end of the current iteration, the asynchronous logic commands the storage of the result and the beginning of a new iteration.

The eoc flag indicates that all the SAR iterations were successfully performed, and the conversion result is available at the output registers. The time delay between the beginning of the conversion phase and the rise of the eoc flag is variable and is mostly limited by the sum of the different regeneration times of the N comparisons and the fixed combinational circuits delays. For the iteration in which |(vdac_p − vdac_n)|= ΔVi is the smallest, the regeneration time, treg,hard, can be much higher than the regeneration time of the other iterations. Conveniently, the conversion phase interval is not evenly distributed between the N iterations, but the time is allocated according to the degree of difficulty of the comparison, preventing, as far as possible, the comparator from remaining in metastability.
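The self-timed allocation of the conversion interval can be illustrated with a minimal behavioral sketch. It assumes that each decision takes roughly τ·ln(VDD/(2|ΔVi|)), the same regeneration model used for Teasy in Sect. 2.3, plus a fixed delay tfix per iteration. The function name and all numerical values (τ, tfix, VDD, comparator inputs) are hypothetical and were chosen only for illustration; they are not design values.

```python
import math

def conversion_time(delta_v, vdd, tau, t_fix):
    """Return (total conversion time, list of regeneration times).

    delta_v : comparator input differences, one per iteration (V)
    vdd     : supply voltage (V)
    tau     : comparator regeneration time constant (s)
    t_fix   : fixed delay per iteration for DAC settling and reset (s)
    """
    # Regeneration time grows logarithmically as the input difference shrinks.
    t_reg = [max(0.0, tau * math.log(vdd / (2.0 * abs(dv)))) for dv in delta_v]
    total = sum(t_reg) + t_fix * len(delta_v)
    return total, t_reg

# Hypothetical inputs: three easy comparisons and one hard one.
delta_v = [0.3, 0.1, 0.05, 1e-5]
total, t_reg = conversion_time(delta_v, vdd=1.0, tau=20e-12, t_fix=1e-9)
hardest = max(range(len(t_reg)), key=t_reg.__getitem__)
print(f"total = {total * 1e9:.3f} ns, hardest iteration index = {hardest}")
```

In this sketch the iteration with the smallest |ΔVi| dominates the regeneration budget, while the easy comparisons release their unused time to it, which is the behavior the asynchronous scheme relies on.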

Most of the ADC current consumption occurs in the interval between the rising of the sampling clock signal (triggering of the asynchronous logic circuit) and the rising of the eoc flag. Since the length of this interval and the number of operations performed in it do not scale with the sampling frequency, an almost linear relation between power consumption and sampling frequency is expected [8]. This relation breaks down at very low sampling frequencies, since in this case the ADC power consumption is dominated by the static power consumption.
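The expected power behavior can be summarized with a simple first-order model, P(fs) ≈ Econv·fs + Pstatic, where Econv is the energy spent per conversion and Pstatic is the static power. This model and the names `adc_power` and `dynamic_fraction` are illustrative assumptions, and the numerical values below are hypothetical, not simulation results of the designed ADC.

```python
def adc_power(fs, e_conv, p_static):
    """Average power (W) at sampling rate fs (S/s), first-order model."""
    return e_conv * fs + p_static

def dynamic_fraction(fs, e_conv, p_static):
    """Share of the total power that scales with the sampling rate."""
    dynamic = e_conv * fs
    return dynamic / (dynamic + p_static)

E_CONV = 10e-12    # hypothetical energy per conversion (J)
P_STATIC = 0.5e-6  # hypothetical static power (W)

for fs in (1e3, 20e3, 1e6, 12e6):
    p = adc_power(fs, E_CONV, P_STATIC)
    share = dynamic_fraction(fs, E_CONV, P_STATIC)
    print(f"{fs:>10.0f} S/s: {p * 1e6:8.3f} uW, dynamic share {share:.1%}")
```

With these assumed numbers the dynamic term dominates at high sampling rates, giving the near-linear region, while at very low rates the static term sets a floor.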

2.1 The switched capacitor network

The DAC capacitor array of a SAR ADC is also the capacitor responsible for tracking the input voltage during the sampling phase and for holding the charge throughout the conversion phase. Due to charge redistributions, different voltage values are set in the capacitor array, according to a defined switching scheme, and applied to the comparator.

The capacitive DAC structure employed in our design is shown in Fig. 4. A pair of identical capacitor arrays is placed at the vdac_p and vdac_n inputs (the p array and the n array). To improve the capacitor matching, capacitors of different sizes within the capacitor arrays are implemented by using multiple unitary capacitors of value Cu. A split binary-weighted capacitive array structure [11] was applied in this design to considerably reduce the required number of unitary capacitors in the DAC. In this structure, M binary-weighted capacitors are placed on the left side of a bridge capacitor and compose the so-called main DAC, and L binary-weighted capacitors are placed on the right side of the bridge capacitor and compose the sub DAC.

Fig. 4 The capacitive DAC array

To reduce the power consumption, the merged capacitor switching scheme (MCS) was adopted for the search process [12]. It provides good switching efficiency and a low digital circuitry complexity. It presents three-level capacitor array voltage switches (to VREFP, VREFN, and common-mode VCM = (VREFP + VREFN)/2 voltage levels) in contrast with the conventional two-level capacitor array voltage switch (to VREFP and VREFN).

The switching scheme works as follows: during the sampling phase of the ADC, the bottom plates of all capacitors in the main and sub DACs are shorted to the common-mode voltage VCM and the top plates of the main DAC capacitors are connected to the differential input signal. During the conversion phase, the DAC is decoupled from the input signal and N iterations are performed. The sequence of steps in the conversion phase is as follows: in the first iteration, vdac_p and vdac_n are compared and the result, indicating whether vdac_p is higher or lower than vdac_n, is stored as “0” or “1” in the most significant bit (MSB) of the output register. In the second iteration, if vdac_p was found to be higher than vdac_n in the previous iteration, Cp,N−2 and Cn,N−2 are connected to VREFN and VREFP, respectively; if vdac_p was lower than vdac_n, they are connected to VREFP and VREFN. A new comparison between vdac_p and vdac_n is performed, and the result is stored in the second most significant bit of the output register. In the following iterations, the remaining capacitors are successively connected to either VREFP or VREFN according to the comparison result of the previous step, vdac_p and vdac_n are compared again, and the result is stored. From the first step to the last one, ten comparisons and nine charge redistributions are performed to obtain the ten bits. After the last iteration, vdac_p and vdac_n will be close to VCM.
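The sequence above can be sketched as an ideal behavioral model. The sketch assumes perfect capacitors, no parasitics, VREFP − VREFN = VFS, and a bit polarity of “1” when vdac_p is higher than vdac_n; the polarity and the function name `mcs_sar_convert` are assumptions made for illustration, not details taken from the circuit.

```python
def mcs_sar_convert(vin_p, vin_n, n_bits=10, v_fs=1.0):
    """Ideal differential SAR conversion with merged capacitor switching.

    Returns (code, vdac_p, vdac_n) after the last comparison.
    """
    vdac_p, vdac_n = vin_p, vin_n  # top plates hold the sampled inputs
    code = 0
    for i in range(1, n_bits + 1):
        # Comparison: the first one uses the sampled voltages unchanged.
        bit = 1 if vdac_p > vdac_n else 0
        code = (code << 1) | bit
        if i < n_bits:
            # Charge redistribution before iteration i + 1: VFS / 2^(i + 1).
            step = v_fs / 2 ** (i + 1)
            if bit:
                vdac_p -= step  # Cp switched from VCM to VREFN
                vdac_n += step  # Cn switched from VCM to VREFP
            else:
                vdac_p += step  # Cp switched from VCM to VREFP
                vdac_n -= step  # Cn switched from VCM to VREFN
    return code, vdac_p, vdac_n

code, vp, vn = mcs_sar_convert(0.8, 0.2)
print(code, round(vp, 4), round(vn, 4))
```

For N = 10 the loop performs ten comparisons and nine charge redistributions, and because the two sides always move by equal and opposite steps, the common mode of vdac_p and vdac_n stays at the sampled common mode while their difference converges toward zero.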

M and L, the number of switches in the main and in the sub DAC, were set to 5 and 4, respectively, because this choice leads to the minimal number of unitary capacitors needed to implement the arrays, improving the matching and connection routing between these devices.
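The effect of the split on the capacitor count can be checked with a short sketch. The counting convention is an assumption: each array uses (2^M − 1) unit capacitors in the main DAC, (2^L − 1) in the sub DAC, and one unit-sized bridge capacitor, with no extra capacitor Cf, and M + L = 9 for the 10-bit converter. The function names are hypothetical.

```python
def unit_caps_per_array(m_main, l_sub):
    """Unit capacitors in one split array (main + sub + unit bridge)."""
    return (2 ** m_main - 1) + (2 ** l_sub - 1) + 1

def conventional_unit_caps(n_caps):
    """Unit capacitors in a non-split binary-weighted array."""
    return 2 ** n_caps - 1

N_CAPS = 9  # M + L switched capacitors per array (assumed from N = 10)
counts = {(m, N_CAPS - m): unit_caps_per_array(m, N_CAPS - m)
          for m in range(1, N_CAPS)}
best = min(counts.values())

for (m, l), count in counts.items():
    marker = "  <- minimum" if count == best else ""
    print(f"M = {m}, L = {l}: {count} unit capacitors{marker}")
print("non-split array:", conventional_unit_caps(N_CAPS))
```

Under this counting, M = 5 and L = 4 reaches the minimum of the sweep (tied with the mirrored split M = 4, L = 5), roughly an order of magnitude fewer unit capacitors than the non-split array.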

In typical implementations of the split capacitive array structure, there is an additional unitary capacitor Cf in parallel with the capacitors of the sub DAC. This capacitor Cf has one of its plates permanently connected to VCM, and it is necessary to provide correct binary-weighted steps in the switching. In this case, the bridge capacitor (Cbridge) should ideally present a capacitance of (16/15)⋅Cu when L = 4 [11]. Nonetheless, it is common practice to use Cbridge = Cu for a better layout and matching and, as a result, the ADC transfer curve will present several output codes with twice the expected length.

In the implementation of Fig. 4, the capacitor Cf is not present. In this case, the bridge capacitor should present a value close to (1.0669)⋅Cu. When the bridge capacitor is implemented as a unitary capacitor, a common practice as mentioned, a − 1.0 LSB (Least Significant Bit) offset and a tolerably small gain error are introduced in the DAC transfer curve [13].

In addition to the errors caused by the bridge capacitor value, parasitic capacitances between nodes vdac_n, vdac_p, vdac_nx, and vdac_px and VDD or ground will introduce additional offset and gain errors. If the parasitic capacitances at the nodes vdac_n/vdac_p are Cp and at the nodes vdac_nx/vdac_px are Cpx, the offset error is around (16⋅Cp + Cpx)/Cu LSB. Note that the errors caused by the parasitics at the nodes vdac_n/vdac_p are by far the most important.
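The offset expression can be evaluated directly to see why the parasitics at vdac_n/vdac_p dominate. The sketch below only restates the approximation (16⋅Cp + Cpx)/Cu from the text; the function name and the parasitic values are hypothetical and chosen for illustration, with Cu set to the 42 fF used in this design.

```python
def parasitic_offset_lsb(c_p, c_px, c_u):
    """Approximate offset (in LSB) caused by top-plate parasitics.

    c_p  : parasitic capacitance at vdac_p / vdac_n (F)
    c_px : parasitic capacitance at vdac_px / vdac_nx (F)
    c_u  : unit capacitance (F)
    """
    return (16.0 * c_p + c_px) / c_u

C_U = 42e-15   # unit capacitor of this design (F)
C_P = 5e-15    # hypothetical parasitic at vdac_p / vdac_n (F)
C_PX = 5e-15   # hypothetical parasitic at vdac_px / vdac_nx (F)

total = parasitic_offset_lsb(C_P, C_PX, C_U)
from_main = parasitic_offset_lsb(C_P, 0.0, C_U)
print(f"offset = {total:.2f} LSB, {from_main / total:.0%} from the vdac_p/vdac_n nodes")
```

For equal parasitic values, the factor of 16 makes the main DAC top-plate nodes responsible for 16/17 of the offset, which is why their routing deserves the most care.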

Since the offset errors due to the parasitic capacitors are probably higher than 1.0 LSB, and the DAC implementation without Cf and with Cbridge = Cu causes only offset and small gain errors but no alteration in the code length of the transfer curve, this implementation was chosen for the ADC.

Although the split array structure reduces the number of unitary capacitors, as compared to a conventional binary-weighted capacitive DAC, the minimum value for Cu to meet the mismatch requirements is higher. For a differential array to provide 3-sigma yield, the minimum value for Cu is given by [13]

$$C_{u} \ge { }9\left( {2^{M} - 1} \right)2^{{2\left( {N - M} \right)}} K_{\sigma }^{2} K_{C}$$
(1)

where KC is the capacitor density parameter (C = KC⋅Area) and Kσ is the capacitor matching coefficient (σ(ΔC/C) = Kσ/√Area).
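Equation (1) is straightforward to evaluate numerically. The sketch below does so for N = 10 and M = 5; the values of Kσ and KC are hypothetical placeholders, not the parameters of the 65-nm process, and the function names are illustrative. With Kσ in µm (relative mismatch times µm) and KC in fF/µm², the result is in fF.

```python
import math

def min_unit_cap(n_bits, m_main, k_sigma, k_c):
    """Minimum unit capacitance from Eq. (1), 3-sigma differential array.

    k_sigma : matching coefficient (relative mismatch x um)
    k_c     : capacitor density (fF / um^2)
    Returns the minimum Cu in fF.
    """
    return 9 * (2 ** m_main - 1) * 2 ** (2 * (n_bits - m_main)) * k_sigma ** 2 * k_c

def square_side(c_u, k_c):
    """Side (um) of a square capacitor of value c_u (fF), from C = Kc * Area."""
    return math.sqrt(c_u / k_c)

K_SIGMA = 0.008  # hypothetical: 0.8 % x um
K_C = 2.0        # hypothetical: 2 fF / um^2

c_u = min_unit_cap(10, 5, K_SIGMA, K_C)
print(f"Cu >= {c_u:.1f} fF, side >= {square_side(c_u, K_C):.2f} um")
```

Because Cu scales with Kσ², a modest improvement in the matching coefficient relaxes the minimum unit capacitance considerably.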

Using Metal-Insulator-Metal (MIM) capacitors for a better matching coefficient, Eq. (1) resulted in a minimum Cu of 42 fF, with dimensions of 4.4 µm × 4.4 µm. This value also meets the requirement that the thermal noise introduced by the switches must be smaller than the quantization noise of the ideal 10-bit ADC [2]. Figure 5 presents the Differential Nonlinearity (DNL) errors caused only by the DAC capacitor mismatches. The mean and standard deviation of the DNL errors are zero and 0.098 LSB, respectively. The capacitor mismatches also influence the effective number of bits (ENOB), which has a mean of 9.94 and a standard deviation of 0.047. These results show that the performance degradation caused by the chosen Cu value is small.

Fig. 5 DNL error caused by DAC capacitor mismatch, obtained through Monte Carlo simulations (DNL precision of 0.1 LSB)

2.2 The bootstrapped switch

A basic sample-and-hold circuit is composed of a hold capacitor and a switch. In our ADC, the hold capacitor is the capacitor array of the DAC, described and sized in Sect. 2.1, whereas the switch is described in this section.

To provide low-distortion differential sampling, bootstrapped switches [9] were considered in the design. These switches keep the gate-to-source voltage of the effective switch transistor nearly constant. The schematic of the employed circuit is shown in Fig. 6.

Fig. 6 The bootstrapped switch schematic

In this circuit, transistors M1 and M2, capacitors C1 and C2, and the inverter Inv1 constitute a voltage doubler necessary for the operation of transistor M3; transistors M3 and M4 are used to charge the capacitor C3; transistors M5, M6, and M8 and the inverter Inv2 are responsible for applying a voltage to the gate of transistor M11; and transistor M11 is the effective switch transistor. The bootstrapped switch works as follows: when sample_clk is set to high logic level, va is clamped to approximately 2VDD and both transistors M3 and M4 are turned on, charging the capacitor C3 to VDD. At the same time, node vg_switch is connected to VSS through transistor M8, switching off the effective switch transistor M11 (switch off). When sample_clk is set to low logic level, va is clamped to approximately VDD and both M3 and M4 are turned off. Now, node vg_switch is connected to the top plate of C3 through transistor M5, node vin is connected to the bottom plate of C3 through transistor M6, and M11 is turned on (switch on) with a gate-to-source voltage of VDD. In this circuit, transistors M9 and M10 are responsible for turning on/off the transistors M6 and M5, respectively, according to the above-mentioned phases. Transistor M7 is added to alleviate the drain-to-source voltage experienced by M8.

Considering transistor M11 is implemented with the lowest possible length, the width of M11, W11, must be carefully determined because the charge injection limits its maximum value, and the bandwidth, on the other hand, limits its minimum value.

For an NMOS switch, the charge injection voltage error can be approximated by [14]

$$\Delta V_{e} = \frac{{ - kW_{N} L_{N} C_{ox} \left( {V_{G} - V_{TH} - vin } \right)}}{{C_{H} }}$$
(2)

where k is the fraction of the channel charge injected into the capacitor array, CH is the capacitance value of the array, VG is the gate voltage at sampling time, and vin is the input voltage at the end of the sampling time.

In the bootstrapped circuit, (VG − vin) = VDD and, to a first approximation, the charge injection would affect vdac_p and vdac_n evenly. In this case, the ADC evaluation would not be affected by the charge injection. Unfortunately, VTH depends on vin, which results in different charges being injected into each side of the differential input. The errors due to charge injection should not be greater than VLSB = VFS/2^N, and the maximum allowed W11 is constrained by

$$\left| {\Delta V_{e1} - \Delta V_{e2} } \right|_{max} = \frac{{kW_{11} L_{11} C_{ox} \left( {V_{TH} \left( {V_{DD} } \right) - V_{TH} \left( 0 \right)} \right)}}{{C_{H} }} \le \frac{{V_{FS} }}{{2^{N} }}$$
(3)

where ΔVe1 and ΔVe2 are the voltage errors of the two switches, and VTH(VDD) and VTH(0) are the threshold voltage of transistor M11 when vin = VDD and vin = 0 V, respectively.

The impact of the sizing of W11 in the track bandwidth becomes clear when one considers that the on-resistance of M11 and the hold capacitor (DAC) form a low pass RC filter. Therefore, the maximum allowed switch resistance for charging the DAC capacitors within half the sampling period with a maximum error of VLSB, as analyzed in [13], is constrained by

$$e^{{\frac{{ - 0.5T_{S} }}{{R_{ON} C_{H} }}}} \le \frac{1}{{2^{N + 1} }}$$
(4)

where TS is the sampling period and RON is the on-resistance of the switch.
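Reading Eq. (4) as a bound on the settling residue, that is, requiring exp(−0.5⋅TS/(RON⋅CH)) to stay below 1/2^(N+1), the maximum on-resistance follows as RON ≤ 0.5⋅TS/(CH⋅(N + 1)⋅ln 2). The sketch below evaluates this bound; the hold capacitance used is a hypothetical value, not the extracted capacitance of the DAC array, and the function name is illustrative.

```python
import math

def max_on_resistance(fs, c_hold, n_bits):
    """Largest switch on-resistance (ohm) that settles within half a period.

    fs     : sampling rate (S/s)
    c_hold : hold capacitance seen by the switch (F)
    n_bits : ADC resolution
    """
    t_track = 0.5 / fs  # half of the sampling period is used for tracking
    return t_track / (c_hold * (n_bits + 1) * math.log(2))

FS = 12e6        # maximum sampling rate of the converter (S/s)
C_HOLD = 1.3e-12 # hypothetical hold capacitance (F)

r_on = max_on_resistance(FS, C_HOLD, 10)
print(f"R_ON <= {r_on / 1e3:.2f} kOhm at {FS / 1e6:.0f} MS/s")
```

The bound scales inversely with the sampling rate, so the highest rate sets the minimum width W11, while the charge injection limit of Eq. (3) sets the maximum.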

Transistors M1-M4 and both inverters of the bootstrapped switch were sized after C1, C2, and C3 were chosen, because they must charge and discharge these capacitors within half of TS. To minimize errors, the capacitors must be considerably larger than the parasitic capacitances added by the transistors connected to their terminals. The remaining transistors were sized according to the strength required to drive the gates of other small transistors within a short time interval.

2.3 The dynamic comparator

The comparator circuit design plays an important role in the system performance. For typical N-bit synchronous SAR ADCs, the N iterations are controlled by the SAR iteration clock, whose period is N times shorter than the sampling clock period. The time budget for the comparator to present a valid response after the start of a comparison, the regeneration time, is usually limited to half a period of the SAR iteration clock. The other half period of the iteration clock is used to reset the comparator and to settle the DAC voltage nodes during the charge redistribution.

An asynchronous converter has no SAR iteration clock, and the sampling period is usually divided equally between the input sampling and the conversion itself. During the conversion, the comparator must perform N decisions, and the available time is allocated to each comparison by the circuitry according to how difficult it is. As a consequence, more time can be spent on the harder comparisons. As shown in [10], this feature reduces the conversion error rate (CER) of the converter, owing to the exponential reduction, compared with a synchronous counterpart of the same resolution, in the probability Pmeta of a metastability event occurring in the conversion of a sample. For the asynchronous SAR ADC, we have

$$P_{meta} = 2^{N + 1} e^{{ - \left( {T_{S} - t_{track} - T_{FIX,TOT} - T_{easy} } \right)/\tau }}$$
(5)

where TS is the sampling clock period, ttrack is the duration of the sampling phase, TFIX,TOT is the sum of all fixed delays for DAC settling and comparator reset, Teasy is the sum of all regeneration times, excluding the hardest one (treg,hard), and τ is the regeneration time constant of the dynamic comparator.

Note that Teasy also depends on the regeneration time constant and its value is approximately

$$T_{easy} \approx \tau \sum\limits_{i \in \left\{ N - 1\ \text{softest comparisons} \right\}} \ln \left( \frac{V_{DD}}{2\left| \Delta V_{i} \right|} \right)$$
(6)

where ΔVi = (vin_p − vin_n) is the input voltage of the comparator in the i-th iteration.
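Equations (5) and (6) can be evaluated together to see how strongly Pmeta depends on τ. The sketch below works with log10(Pmeta), because for realistic time budgets the exponential underflows double precision. The function names and every numerical value are hypothetical and serve only as an illustration of the formulas.

```python
import math

def t_easy(tau, vdd, delta_v_easy):
    """Eq. (6): total regeneration time of the N - 1 easiest comparisons."""
    return tau * sum(math.log(vdd / (2.0 * abs(dv))) for dv in delta_v_easy)

def log10_p_meta(n_bits, t_s, t_track, t_fix_tot, t_easy_tot, tau):
    """Eq. (5) in log10 form, to avoid floating-point underflow."""
    t_available = t_s - t_track - t_fix_tot - t_easy_tot
    return (n_bits + 1) * math.log10(2.0) - (t_available / tau) / math.log(10.0)

# Hypothetical budget: 10 ns period, half for tracking, 3 ns of fixed delays.
TAU = 50e-12
easy = t_easy(TAU, vdd=1.0, delta_v_easy=[0.25, 0.125, 0.0625])
for tau in (50e-12, 25e-12):
    lp = log10_p_meta(10, 10e-9, 5e-9, 3e-9, t_easy(tau, 1.0, [0.25, 0.125, 0.0625]), tau)
    print(f"tau = {tau * 1e12:.0f} ps -> log10(Pmeta) = {lp:.1f}")
```

Halving τ roughly doubles the magnitude of the exponent in Eq. (5), so log10(Pmeta) drops by many decades, which is why reducing τ is the most effective lever once TS is fixed.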

Once the value of TS has been chosen, the best way to keep Pmeta and CER low is to reduce τ. The dynamic comparator topology of [15] was adopted in this work since it presents a low regeneration time constant and, as an additional advantage, reduced kickback noise. The comparator schematic is shown in Fig. 7. In this circuit, the comparator itself is composed of transistors M1 to M14. The logic gates at the output stage of the comparator are responsible for delivering the comparison result (comp_out_n) and the ready indication flag (comp_ready_n) to other circuits.

Fig. 7 The comparator schematic

The comparator works as follows: when the clkc node is set to high logic level, the circuit is in the reset phase, which turns on transistors M11 and M12 and turns off transistors M7 and M8, setting both output voltages, voutp and voutn, to VDD. After these nodes reach VDD, the ready_n signal is set to high, signaling the end of the reset, and is fed back to switches M3 and M4, keeping them on. The comp_ready_n signal is a copy of ready_n, except that it is forced to “0” by eoc_n, the inverted version of eoc, when ready_n is locked at “1” due to a metastability event. This action is necessary for a proper reset of the asynchronous logic, which depends on comp_ready_n, before the next conversion. Transistors M13 and M14 were added to discharge the parasitic capacitors of nodes ap2 and an2, ensuring that they present the same voltage level before the next comparison.

When the clkc node is set to low logic level, the regeneration phase is triggered: transistors M11, M12, M13, and M14 are turned off, transistors M7 and M8 are turned on, and the output voltages fall from VDD according to the current drained by the input differential pair M1 and M2, which depends on (vinp − vinn). The cross-coupled transistors M5/M6 and M9/M10 further amplify the differential input, increasing the difference between voutp and voutn. As soon as the differential output voltage has enough magnitude to trigger the output logic circuits, the ready_n and comp_ready_n nodes are set to low, signaling the end of the comparison, and the switches M3 and M4 are opened, disconnecting the M1 and M2 differential pair from the rest of the comparator circuit. Since the drain terminals of the differential pair no longer sense the voltage swing driven by the cross-coupled pairs, this topology reduces the gate-drain voltage swing of M1 and M2 and, therefore, the noise coupling between ap2/an2 and vinp/vinn through the gate-drain capacitances CGD of M1 and M2 (kickback noise). Note that M3 and M4 are open during the final part of the regeneration phase and most of the reset phase.

The dimensions of the transistors were chosen considering the regeneration time constant and mismatches. To a first approximation, the mismatches, and the offset generated by them, are not a critical concern. In fact, the latch offset, when the common-mode voltage is kept constant, is not affected by the input values and, in consequence, is almost time invariant and only slightly influences the ADC linearity [16]. Still, an excessive offset could decrease the signal-to-noise ratio (SNR) by reducing the input voltage swing [13]. Therefore, the differential pair transistors were enlarged to reduce the mismatch, their size being limited by the kickback noise problem. The remaining transistors, on the other hand, were made as small as possible to reduce their parasitic capacitances, fulfilling the time constraints of the system. The reduction of the NMOS transistors was limited, for this design, by the minimal enclosure space required for the drain and source regions to hold at least 2 diffusion contacts, whereas the width of the PMOS transistors was limited to 4 diffusion contacts, as a compensation for the reduced mobility of their majority carriers. This care was taken to increase the yield. Finally, buffers were added in the output stage to mitigate the coupling from the comp_out_n node into the voutp and voutn nodes during the regeneration phase. The buffers were taken from the standard cell library.

2.4 The asynchronous logic

The SAR logic circuit is responsible for providing the switching sequence for the DAC circuit. In this work, a custom combinational asynchronous SAR logic, based on the circuit described in [8], with modifications to support the merged capacitor switching scheme, was employed. This asynchronous logic allows a “manual” device placement, presents reduced power consumption, and makes use of only one clock source.

Figure 8 presents the SAR logic circuit block diagram, and Figs. 9, 10, and 11 present, respectively, the implementation of the Main Control, the DAC Control and the Comparator Control circuits.

Fig. 8 The SAR logic block diagram (some signals are omitted for simplicity)

Fig. 9 The main control circuit

Fig. 10 The DAC control circuit

Fig. 11 The comparator control circuit

The Main Control circuit has four input signals, coming from the preceding Main Control circuit, from the Comparator, from the DAC Switch, and from the General Control, and two output signals, going to the next Main Control circuit, to the DAC Control, to the next DAC Control, to the DAC Switch, and to the Comparator. At the beginning of the comparison phase, the node states of the Main Control circuit are: eoc_n = “1”, bit_set = “0”, comp_ready_n = “1”, bit_ready = “0”, compare_n = “1”, and next_bit_set = “0”. The Main Control k is ready for operation when its bit_set is set to high. After that, the bit_ready is also set to high by the DAC Switch k, signaling that the previous DAC charge redistribution was concluded. The Main Control k will then drive the node compare_n to low, and that will start, through the Comparator Control, a new comparison. When the comparison is ready, the comp_ready_n is set to low, causing the change in the value of next_bit_set. Now, the next_bit_set has a high level, and it will reset the comparator, through the Comparator Control, will release the DAC Control k circuit to store the comparator result, and will initiate a new DAC charge redistribution and the operation of the Main Control (k − 1), through the DAC Switch (k − 1).

The DAC Control circuit has six input signals and one output signal. At the beginning of the comparison phase, the node states of this circuit are: eoc_n = “1”, bit_set = “0”, comp_ready_n = “1”, compare_n = “1”, next_bit_set = “0”, and bit_n = “1”. The DAC Control k operation also starts with bit_set = “1”, discharging the node bit_n. When the comp_ready_n is set to low after the end of a comparison, and while the signal next_bit_set is kept low, the DAC Control will dynamically register the current comparator result in bit_n. This signal is sent to the DAC Switch (k − 1) to be used in the switches’ selection.

To drive the bottom plates of the capacitors in the DAC array to either VREFP or VREFN, inverters were used as switches in the proposal of [8]. In this work, the adoption of the MCS required the use of three-level voltage switches for each bit in the DAC. Therefore, the DAC Switch (k − 1) must initially short the bottom plates of the capacitors to VCM and then, according to each comparison result, switch Cp,k−1 and Cn,k−1, respectively, to either VREFN and VREFP or VREFP and VREFN. Figure 12 illustrates the implementation of the DAC Switch for a single bit. The DAC Switch has three input signals, bit, bit_set_dly, and eoc_n, and one output signal, bit_ready, which signals to the asynchronous circuit that the DAC has settled. The DAC Switch also generates six internal signals, s0, s0_n, s1, s1_n, s2, and s2_n, which control the capacitive array switches. The bit input signal of the DAC Switch (k − 1) is provided by inverting the bit_n output of the DAC Control k circuit. This is accomplished by inverters, which also provide buffering and protect the high-impedance bit_n nodes from the noise coupled into the long interconnections going from the DAC Control to the DAC Switches and Output Registers. At the beginning of the comparison phase, the bit_set_dly state is “0” and only the switches connecting to VCM are closed. When the comparison is ready, the Main Control k sets the bit_set_dly of the DAC Switch (k − 1) to high, which opens the VCM switches. The two NAND gates then prepare the signals controlling the other switches, according to the value of the node bit, which comes from the DAC Control k. Using bit_set_dly, the delayed version of next_bit_set, provides a longer time for the preparation of the node bit and ensures that this signal is stable when s0_n is set to high, which avoids glitches in the NAND gates. The signal bit_ready, sent to the Main Control (k − 1), is set to high after the charge redistribution is completed.

Fig. 12 The DAC switch circuit

Note that, due to the careful design of the DAC Switch Control logic, the nodes s0, s1, and s2 are never set to high at the same time, avoiding temporary short circuits between VCM and VREFN or VREFP power sources.

The complete SAR logic block works as follows. During the sampling phase of the ADC, the sample_clk signal is at low logic level and no activity is observed in the circuit. When sample_clk goes high, the asynchronous circuit is triggered and N iterations take place. In the first iteration, the General Control presets the Main Control 9 (1), bit_set9 = “1”, and waits for the comparator input signals to settle after the bootstrapped switches open. Once the signals have settled (2), bit_ready10 = “1”, the Main Control 9 requests a comparison (3) from the Comparator Control, compare_n9 = “0”, which, in turn, sets the dynamic latch comparator to the regeneration phase (4), clkc = “0”. When the result is available at the output of the comparator (5), the Main Control 9 and DAC Control 9 blocks are signaled (6) with comp_ready_n = “0”. After the comparison, bit_set8 rises (7) to preset the Main Control 8 and DAC Control 8. In addition, a delayed and buffered version of bit_set8, bit_set_dly8 (9), is used to lock the DAC Control 9 (next_bit_set) after bit9 has settled, to reset the Comparator Control, and to trigger the DAC Switch 8 and Output Register 9. Now bit9 (8) holds the actual 10th bit of the converted signal, which is used by the Output Register 9 and DAC Switch 8 circuits. The latter determines, according to the value of bit9, to which reference voltage the bottom plates of the MSB capacitors of the capacitive array should be switched. Finally, after a fixed delay, bit_ready9 is set high (10), and the Main Control 8 initializes the next iteration. The Main Control 8 and the DAC Control 8 repeat the operations performed by the Main Control 9 and the DAC Control 9 to obtain the 9th bit. The sequence of iterations continues until all bits are obtained.
Once the conversion is completed, the ok_dly signal is fed back to the General Control, the eoc flag rises, and all the circuits of the SAR logic block are reset. It should be mentioned that bit_n0 holds the first bit (the LSB) of the converted signal and does not trigger any switching in the DAC.
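The bit-by-bit sequence described above can be illustrated with a purely behavioral sketch. It is not the authors' circuit: it assumes an ideal comparator, an ideal differential DAC whose step halves after every decision, and a reference equal to 1.2 V. The function name `sar_convert` and its parameters are hypothetical. It only mirrors the order of events: one comparison per bit, MSB first, with no DAC switching after the last decision.

```python
def sar_convert(vinp, vinn, vref=1.2, nbits=10):
    """Idealized differential SAR conversion, MSB first (behavioral only)."""
    residue = vinp - vinn      # differential input, assumed in [-vref, +vref)
    step = vref / 2.0          # assumed DAC shift applied after the MSB decision
    code = 0
    for k in range(nbits - 1, -1, -1):
        bit = 1 if residue >= 0.0 else 0   # ideal comparator decision for bit k
        code |= bit << k
        if k > 0:                          # the last decision switches nothing
            residue += -step if bit else step
            step /= 2.0
    return code


# Mid-scale, full-scale and zero-scale differential inputs
print(sar_convert(0.6, 0.6), sar_convert(1.2, 0.0), sar_convert(0.0, 1.2))
```

The model ignores timing, settling and metastability, which are exactly what the asynchronous control logic has to manage in the real circuit.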

3 Layout considerations

The layout of mixed-signal integrated circuits usually has its analog circuitry placed as far as possible from its digital circuitry and presents dedicated supply voltages for each domain, to reduce the noise coupled into sensitive areas. The ADC was laid out in a 65-nm CMOS technology following these guidelines and is shown in Fig. 13.

Fig. 13 Layout of the ADC

In the layout, the bootstrapped switches and the comparator were placed at the leftmost side of the chip to reduce the routing length of the sensitive wires vinp, vinn, vdac_p and vdac_n. The wires vinp and vinn had their widths increased to provide a low resistance path, reducing the amount of charge injected into the hold capacitor when the bootstrapped switches are opened. In addition, the vdac_p and vdac_n connections were carefully analyzed and placed to reduce their parasitic capacitances, which are the main cause of offset errors. To keep the power supply and reference nodes as steady as possible, decoupling capacitors (Decaps) were added to these nodes, at the rightmost side of the chip.

The symmetry in the layout of a differential ADC can affect the system linearity. So, all circuits that appear in pairs, like the bootstrapped switches, capacitive arrays, DAC switches and Decaps, were symmetrically placed along the horizontal axis. Although the comparator presents itself as a single circuit, any asymmetry in its implementation directly affects its offset voltage. So, a meticulous layout drawing of this circuit was performed.

To greatly reduce the congestion caused by digital signal routing, the digital circuitry was placed along the horizontal axis in sequential bit order. Following this alignment, a row of DAC switches was placed near each capacitor array, which allowed a pair of reference voltage buses to run from the Decaps along the DAC switches.

Additionally, for each bit, delay cells were inserted to give the charge redistribution of the DAC enough time to settle. For the least significant bits, however, delay cells were not necessary because the delay of the logic circuits alone is enough for the small amount of charge to be redistributed before the next comparison.

Finally, to further isolate sensitive circuitries, some transistors were implemented within a Deep N-Well guard ring. This careful procedure is applied to the transistors of the bootstrapped switch, except for the transistors related to the voltage doubler circuits, and the transistors of the comparator circuit.

4 Simulation results

The performance of the ADC was verified by running post-layout simulations with the Cadence Spectre simulator. To estimate static parameters, such as Differential Nonlinearity (DNL) and Integral Nonlinearity (INL), the code density analysis is applied [2]. In this work, it was performed by running a transient simulation in which the ADC converted 10,240 samples of a differential ramp input signal (vinp increases from zero up to the full-scale voltage, VDD, and vinn decreases from the full-scale voltage down to zero). The resulting code density was analyzed to extract the DNL and INL, which are shown in Fig. 14. For the typical corner model and VDD = 1.2 V, the worst DNL and INL errors were found to be 0.52 LSB and 0.53 LSB, respectively. The results of this transient simulation were also used to plot the actual transfer curve of the ADC, from which parameters like gain and offset errors were obtained. The calculated gain error is 18.1 LSB (less than 1.8% of the full-scale range), whereas the calculated offset error is −11.1 LSB (less than 1.1% of the full-scale range).
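As an illustration of the code density analysis, the sketch below derives DNL and INL from the histogram of output codes of a ramp test. It is a minimal version under stated assumptions: a linear ramp, end codes excluded from the statistics, and INL taken as the running sum of DNL. The exact procedure of [2] may differ, and the function name `dnl_inl_from_ramp` is hypothetical. With 10,240 samples and 1024 codes, the ideal count is 10 hits per code.

```python
from itertools import accumulate


def dnl_inl_from_ramp(codes, nbits=10):
    """DNL and INL (in LSB) from the code histogram of a linear ramp test."""
    hist = [0] * (2 ** nbits)
    for c in codes:
        hist[c] += 1
    inner = hist[1:-1]                  # end codes are excluded (assumption)
    ideal = sum(inner) / len(inner)     # average hits per code
    dnl = [h / ideal - 1.0 for h in inner]
    inl = list(accumulate(dnl))         # running sum of DNL
    return dnl, inl


# Usage with an ideal 10-bit quantizer standing in for the simulated ADC
n_samples = 10240
ramp = [(i + 0.5) / n_samples for i in range(n_samples)]
ideal_codes = [int(v * 1024) for v in ramp]
dnl, inl = dnl_inl_from_ramp(ideal_codes)
print(max(abs(d) for d in dnl), max(abs(v) for v in inl))
```

For an ideal quantizer both figures are zero; feeding the simulated output codes instead would give curves of the kind plotted against output code.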

Fig. 14 DNL/INL vs. output code

Parameters like ENOB and power consumption were extracted by running transient simulations in which 1024 samples of a differential sinusoidal input signal are converted and analyzed. For each simulation, the differential input signal frequency is set slightly below half the sampling frequency (Nyquist criterion), the peak-to-peak voltage to 90% of the input range (to avoid saturation), and the common-mode voltage to 0.6 V (VCM). The frequency spectrum of the converted codes yields parameters like the signal-to-noise and distortion ratio (SNDR), required for the ENOB calculation [2]. Although the amplitude reduction of the input signal decreases the SNDR, and consequently the ENOB, this effect is compensated by applying a correction factor [17] to the ENOB. Using this setup, a set of simulations with different sampling frequencies was run to observe the range over which the system keeps a satisfactory performance. For each simulation, a precise input signal frequency was calculated for coherent sampling. For the typical corner model and VDD = 1.2 V, the obtained results are shown in Table 1. The table shows that the ADC can operate at sampling rates ranging from 10 kS/s up to 12 MS/s with practically constant ENOB and low power consumption, confirming that the ADC is suitable for a wide spectrum of low power applications.
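The dynamic test setup can be sketched as follows. Two assumptions are made here. First, the coherent input frequency is chosen as an odd number of cycles within the 1024-sample record, which guarantees a cycle count coprime with the record length only because 1024 is a power of two. Second, the amplitude correction is taken as the commonly used term 20·log10(full scale/amplitude) added to the SNDR; the exact factor of [17] is not reproduced in this section. Function names are hypothetical.

```python
import math


def coherent_fin(fs, n_samples=1024):
    """Input frequency just below fs/2 with an odd number of cycles in the record."""
    cycles = n_samples // 2 - 1
    if cycles % 2 == 0:            # keep the cycle count odd
        cycles -= 1
    return cycles * fs / n_samples, cycles


def enob(sndr_db, amplitude_fraction=1.0):
    """ENOB from SNDR, compensating an input below full scale (assumed correction)."""
    corrected = sndr_db - 20.0 * math.log10(amplitude_fraction)
    return (corrected - 1.76) / 6.02


fin, cycles = coherent_fin(12e6)
print(fin, cycles)
print(enob(59.0, amplitude_fraction=0.9))   # 59.0 dB is an illustrative SNDR, not a result
```

With a 90% amplitude the assumed correction adds a little under 1 dB to the SNDR before it is converted to bits.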

Table 1 Performance of the SAR ADC for different sampling frequencies

As expected, the simulated power consumption of the circuit presented an almost linear dependence on the sampling frequency. As Fig. 15 shows, the ratio “Power/Sampling frequency” is roughly constant for sampling rates ranging from 0.1 up to 12.0 MS/s.

Fig. 15 Simulated average power consumption vs. sampling frequency

Figure 16 shows the current drawn from the several power sources applied to the ADC. The power sources are: the digital power supply (vddd), the analog power supply (vdda), the positive reference power supply (vrefp), the common-mode power supply (vcm) and the input power sources (vinp and vinn). The biggest contributions to the power consumption come from the digital circuitry (current drawn from vddd) and from the capacitive DAC (current drawn from vrefp). Some of the digital circuitry was implemented with cells from the foundry standard digital library, which means that lower power consumption could be achieved if only custom cells were employed. Additionally, the applied merged capacitor switching scheme [12] reduced the current drawn from vrefp and kept the power spent on the DAC charge redistribution low.

Fig. 16 ADC average current consumption at 12 MS/s

The method used to estimate Pmeta and the CER due to metastability was based on [18]. By running a set of transient simulations of the comparator, each with a different constant input voltage ΔVi,k = vinp,k − vinn,k, the τ parameter can be extracted from the different regeneration times of the comparator in each case. If the set of input voltages is arranged in a geometric progression with a common ratio of 1/10, we have ΔVi,k = ΔVi,k-1/10 and Eq. (6) can be used to calculate

$$\tau \approx \frac{{t_{p,k} - t_{p,k - 1} }}{\ln 10}$$
(7)

where tp,k and tp,k-1 are the regeneration times for the inputs ΔVi,k and ΔVi,k-1, respectively.
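Equation (7) can be applied to every pair of consecutive simulations and the results averaged. The sketch below does this for a list of regeneration times ordered by decreasing input voltage. The regeneration times used in the example are synthetic, generated from an assumed τ only to exercise the function; they are not simulation results, and the name `extract_tau` is hypothetical.

```python
import math


def extract_tau(regen_times, ratio=10.0):
    """Average tau from regeneration times of inputs scaled by 1/ratio per step."""
    diffs = [b - a for a, b in zip(regen_times, regen_times[1:])]
    taus = [d / math.log(ratio) for d in diffs]   # Eq. (7) for each consecutive pair
    return sum(taus) / len(taus)


# Synthetic regeneration times built from an assumed tau of 29 ps
assumed_tau = 29e-12
times = [100e-12 + k * assumed_tau * math.log(10.0) for k in range(4)]
print(extract_tau(times))
```

Averaging over several decades of input voltage reduces the sensitivity of the estimate to the reading error of any single regeneration time.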

For the typical corner model and VDD = 1.2 V, post-layout simulations of the comparator show that τ = 29 ps, and the simulations of the ADC at 12.0 MS/s show that tmargin = (TS − ttrack − TFIX,TOT − Teasy) is close to 17.4 ns. With these parameters, Pmeta has an extremely small value, on the order of 10⁻²⁵⁸ errors/sample, leading to a negligible CER.

The simulations for extraction of DNL/INL, ENOB and CER were repeated under process (Typical, Fast-Fast and Slow-Slow), supply voltage (1.08 V, 1.2 V, and 1.32 V), and temperature (0 °C, 27 °C, and 85 °C) (PVT) variations to evaluate the behavior of the circuit under these different conditions. In the worst cases of PVT variations, at 12.0 MS/s, the circuit presented an ENOB of 9.44 (@FF, 1.32 V, 0 °C), a power consumption of 228 μW (@FF, 1.32 V, 85 °C) and a CER of 4.95⋅10⁻⁹ (@SS, 1.08 V, 85 °C).

The CER proved to be the most sensitive parameter because the exponent of Eq. (5), −tmargin/τ, varies significantly across PVT conditions. Table 2 shows the typical and the slow-slow (SS), 85 °C results for different supply voltages. Notice that the exponent can change from −601, in the typical case, to −26.7, the highest value obtained among all tested conditions. This wide variation in the exponent causes, in turn, a huge variation in the CER, which goes from 10⁻²⁵⁸ to 10⁻⁹. Even so, the obtained CERs are compatible with the Bit Error Rate (BER) requirements reported in the literature for low power Zigbee and BLE frontends [19, 20], which range from 10⁻⁶ up to 10⁻³. For stricter applications, the CER of the ADC can be greatly reduced by limiting the supply voltage to VDD ± 5%. In that case, the worst PVT condition is (@SS, 1.14 V, 85 °C), with a CER as low as 1.92⋅10⁻⁵⁹.
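The sensitivity to the exponent is easy to see by converting it to decades. The sketch below evaluates only the exponential term exp(−tmargin/τ); the remaining terms of Eq. (5) are not evaluated here, so the values it prints are not the CER itself. For the two exponents quoted above, the exponential term alone is about 10⁻²⁶¹ and 10⁻¹¹·⁶, so the other terms of Eq. (5) would have to account for the difference to the reported 10⁻²⁵⁸ and 10⁻⁹. Function names are hypothetical.

```python
import math


def exponent(t_margin, tau):
    """Exponent -tmargin/tau of the metastability probability."""
    return -t_margin / tau


def log10_exp_term(exp_value):
    """Decades spanned by exp(exp_value), i.e. log10(e**exp_value)."""
    return exp_value / math.log(10.0)


print(exponent(17.4e-9, 29e-12))     # typical case, from the rounded values in the text
print(log10_exp_term(-601.0))        # typical exponent reported in the text
print(log10_exp_term(-26.7))         # worst exponent reported in the text
```

A change of the exponent by a factor of about 22 therefore moves the exponential term by roughly 250 decades, which is why the CER dominates the PVT analysis.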

Table 2 CER of the ADC for different supply voltages, at typical and worst corners

A summary of the simulation results of this design is shown in Table 3, alongside measurement results from other published works with similar topology, resolution and fabrication process.

Table 3 Performance comparison of similar SAR ADCs

5 Conclusions

A low power, differential, asynchronous 10-bit SAR ADC was successfully designed, laid out and simulated. The topology employs split binary-weighted capacitive arrays, to reduce the circuit area, and the merged capacitor switching scheme, to reduce the power consumption. Improved switches were designed for the sample-and-hold circuit to ensure the linearity of the ADC. Finally, a reduced kickback noise comparator was applied, which presented low power consumption, although additional transistors were required to discharge internal nodes. Together, these modifications resulted in an ADC with performance comparable to the best circuits in the literature. In post-layout simulations, at the maximum sampling rate of 12.0 MS/s, the circuit achieved an ENOB of 9.65 and a power consumption of 151.4 μW, leading to a Figure of Merit of 15.8 fJ/conversion-step; at a 10.0 kS/s sampling rate, the ENOB is almost the same, 9.63, but the power consumption drops to only 0.26 μW. These results show that the implemented ADC can be employed in a wide range of low power, moderate resolution systems, as expected.
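The Figure of Merit quoted above can be cross-checked with a short calculation, assuming it follows the usual Walden definition, FoM = P / (2^ENOB · fs); the text does not spell out the formula, so this is an assumption. With the rounded values given in the text, the result lands within a few percent of the reported 15.8 fJ/conversion-step. The function names are hypothetical.

```python
def walden_fom(power_w, enob_bits, fs_hz):
    """Energy per conversion step in joules (assumed Walden definition)."""
    return power_w / (2.0 ** enob_bits * fs_hz)


def energy_per_conversion(power_w, fs_hz):
    """Average energy spent per sample in joules."""
    return power_w / fs_hz


fom_max = walden_fom(151.4e-6, 9.65, 12.0e6)    # maximum sampling rate
fom_min = walden_fom(0.26e-6, 9.63, 10.0e3)     # lowest sampling rate
print(fom_max * 1e15, fom_min * 1e15)           # in fJ/conversion-step
print(energy_per_conversion(151.4e-6, 12.0e6) * 1e12)   # in pJ per sample
```

Under this definition the figure at 10.0 kS/s is higher than at 12.0 MS/s, because the power does not scale all the way down in proportion to the sampling rate.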