1 Introduction to open-source IC design

Free and open-source software (FOSS) for the design of integrated circuits (IC) has been around for decades. The most prominent specimen, the Simulation Program with Integrated Circuit Emphasis (SPICE), was one of the first simulators for electronic circuits in widespread use and is still the foundation for most of today’s commercial and FOSS simulators. Since then, however, FOSS for IC design has remained a marginal phenomenon, relevant mostly to academic research.

Recently, FOSS for IC design has experienced a renaissance, driven by companies like Alphabet (Google), a considerable community of academics and hobbyists, and commercial entities that aim to simplify access to and use of a tool stack for digital and analog design. Sponsored or low-cost multi-project wafer runs allow private individuals, low-budget academic research groups, and startup companies to enter IC design and explore new creative ideas, lowering the entry barrier to this otherwise budget-intensive field.

All these advances have led to a new spark in the FOSS IC design area, which is continuously picking up pace. Multiple complete RTL-to-GDS flows for digital design are available, combining many tools, and the tools for analog design are also maturing. Finally, two open-source process design kits (PDKs) are available, with more announced, giving everyone access to the information required to design a chip.

Besides benefiting IC research and development through easily shareable designs and results (for example, publishing the full design files of a circuit alongside the usual measurement results in a scientific paper), these open-source EDA stacks can be used in teaching without SW licenses, PDK NDAs, or the mandatory (but hindering) access restrictions for IP protection.

2 Open-source IC design software stack and usage

The use of open-source software in the design of ICs is not a recent phenomenon. On the contrary, many important developments in the IC area started at university research labs many years ago, and the resulting software was made publicly available under open-source licenses. The venerable SPICE [20] and the layout editing tool Magic [25] are just two examples (and interestingly, both are still in use today).

Furthermore, freely available open-source PDKs are not a recent development either, with examples like ASAP7 [7] being well-known and often used in research papers. However, the PDKs available until recently, like ASAP7, are theoretical PDKs: at best, simulation data can be created and a layout file (GDS) can be produced, but no IC can actually be manufactured from these efforts.

This situation changed drastically in 2020 when SkyWater Technology and Google released the first open-source manufacturable PDK [41]. This opened the door to having an actual IC manufactured in a mature node (130 nm), designed exclusively with open-source design tools, and with access to a PDK and process documentation without the need to sign a non-disclosure agreement (NDA).

However, one caveat is that to arrive at a fully functional design flow, many SW packages must be downloaded and compiled locally, package dependencies resolved, configurations set correctly, and so on, all without a professional support hotline. While it is known that electronic design automation (EDA) flows for ICs are complex, this additional layer of difficulty is a significant entry barrier for newcomers. To substantially lower this barrier, packaged, configured, and tested environments like the IIC-OSIC-TOOLS [27] are available; this SW stack has also been used in the development of this ADC.

Table 1 lists the SW components of this environment that were used for the presented mixed-signal design; many more are available, as documented in the README of [27]. As can be seen, a full-custom (analog) design flow is supported, spanning from circuit entry, via simulation, to layout design and parasitic extraction. Likewise, a digital RTL-to-GDS flow is supported using OpenLane/OpenROAD [2, 33], which allows the effective implementation of large digital circuits [11].

Table 1 Open-Source tools and PDK available in IIC-OSIC-TOOLS used in this ADC

Fig. 1 depicts how the individual building blocks are constructed and assembled into the final top-level block. Custom analog blocks like the capacitor matrix, comparator, and the \(V_{\mathrm{CM}}\)-generating charge pump consist of a schematic SPICE netlist (used for simulation and LVS), which is created using Xschem, and a layout view created in Magic. Magic can also write a LEF and a GDS file, which, together with a Verilog stub, are used for top-level assembly using OpenLane/OpenROAD.

Fig. 1

Block diagram of the proposed open-source design flow, including the essential tools and used/generated files

With the same tool set, custom digital standard cells can also be created, which has been used for creating dedicated large-delay cells, as the available delay cells in the SKY130 libraries proved inadequate for this design. Behavioral Verilog models are synthesized and tech-mapped into a SKY130 gate-level representation using Yosys [40], and all subcomponents (custom analog, custom digital, and standard digital) are automatically placed and routed using OpenLane/OpenROAD, requiring only minimal manual intervention.

3 Circuit design

As shown in Fig. 2, the presented SAR-ADC consists of segmented capacitive digital-to-analog converters (DAC), a self-clocking mechanism, an embedded switched-capacitor voltage divider for common-mode voltage generation, a fully dynamic two-stage latch comparator as proposed in [12] and a digital control block with an integrated oversampling decimation filter. However, this paper focuses on the DAC, the self-clocking mechanism, and the digital control block.

Fig. 2

Block diagram of the proposed non-binary SAR-ADC with self-clocking mechanism, embedded common-mode voltage generation, and highly flexible integrated decimation filtering [19]

The internal clock generation is based on custom-designed delay cells in the form factor of a SKY130 digital standard cell. This allows the parametrization of an analog circuit in a hardware description language (HDL) and hardening using an unmodified digital design workflow.

The proposed SAR-ADC has a physical resolution of 12 bit, which can be increased up to 16 bit by oversampling followed by a boxcar decimation filter with a selectable oversampling rate (OSR) of up to 256. The four least significant SAR weights include a digital averaging filter to reduce the sampled noise. Post-layout simulations reveal a maximum sampling rate of 1.44 MS/s with nominal process parameters. Sources, schematics, and the final layout are published on GitHub [18] under the Apache-2.0 license.
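The relation between OSR and output word size can be illustrated with a minimal boxcar decimation sketch. It assumes the common oversample-and-decimate scaling in which every factor of four in OSR contributes one extra output bit, so that OSR = 256 maps 12-bit samples to a 16-bit result; the function name `boxcar_decimate` and this scaling are illustrative assumptions and are not taken from the published RTL.

```python
def boxcar_decimate(samples, osr):
    """Sum blocks of `osr` samples and rescale (illustrative sketch).

    Assumption: OSR is a power of four, and each factor of four adds one
    output bit, i.e. the block sum is shifted right by log2(osr) / 2 bits.
    """
    log2_osr = osr.bit_length() - 1
    if osr < 1 or osr != 1 << log2_osr or log2_osr % 2:
        raise ValueError("this sketch only handles OSR = 4**n")
    if len(samples) % osr:
        raise ValueError("sample count must be a multiple of OSR")
    shift = log2_osr // 2
    return [sum(samples[i:i + osr]) >> shift
            for i in range(0, len(samples), osr)]


# 256 full-scale 12-bit samples give a result that fits into 16 bits.
full_scale = boxcar_decimate([4095] * 256, 256)[0]
print(full_scale, full_scale < 2**16)
```

With this scaling the output rate drops by the OSR, so the highest resolution is traded against throughput.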

3.1 Digital-to-analog converter (DAC) matrix

Due to its simplicity and power efficiency, a 12-bit digital-to-analog converter based on charge redistribution [37] is used for the proposed SAR-ADC. The input signal is sampled onto a capacitive array using the top-plate sampling method. As shown in Fig. 3, the DAC matrix is segmented into 511 cells of a 9-bit thermometer code and three cells of a 3-bit binary code [16] for area efficiency and improved linearity. In addition, an arrangement of the whole DAC matrix with fairly equal capacitor cells reduces offset and mismatch errors. A single thermometer cell consists of \(8C\) unit capacitors (where \(C^{0}=447\,{\text{aF}}\)) in an area of 25 \(\upmu\)m\({}^{2}\). The binary cells are constructed from a thermometer cell with binary weighted unit capacitors of \(1C\)/\(2C\)/\(4C\).
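The segmentation arithmetic can be checked with a short sketch. It assumes that the upper nine bits of the 12-bit code select the number of active thermometer cells and the lower three bits drive the \(4C\)/\(2C\)/\(1C\) binary cells; the function names are hypothetical and only serve the illustration.

```python
THERMO_CELLS = 511          # 9-bit thermometer segment
UNITS_PER_THERMO = 8        # each thermometer cell holds 8 unit capacitors
BINARY_WEIGHTS = (4, 2, 1)  # unit capacitors in the three binary cells


def split_code(code):
    """Split a 12-bit code into (active thermometer cells, binary bits)."""
    if not 0 <= code <= 4095:
        raise ValueError("code must be a 12-bit value")
    thermo = code >> 3                      # upper 9 bits (assumed mapping)
    low = code & 0b111                      # lower 3 bits
    bits = tuple((low >> b) & 1 for b in (2, 1, 0))
    return thermo, bits


def active_units(code):
    """Number of unit capacitors switched for a given code."""
    thermo, bits = split_code(code)
    return thermo * UNITS_PER_THERMO + sum(
        w * b for w, b in zip(BINARY_WEIGHTS, bits))


# 511 * 8 + 4 + 2 + 1 unit capacitors cover the full 12-bit range.
print(THERMO_CELLS * UNITS_PER_THERMO + sum(BINARY_WEIGHTS))
```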

Fig. 3

Assembly of the DAC-matrix. 511 thermometer and three binary cells allow addressing 4095 waffle-capacitors. Additionally, 18 drive-, 77 dummy-, and one gate-cell (S&H switch) complete the DAC

3.1.1 Layer stack considerations

The number and properties of available routing layers in the SKY130 layer stack [35] must be considered in the early phase of system-level planning. The SKY130 PDK is limited to one local-interconnect layer (li) and five metal routing layers (m1 to m5). Since the design rules were disadvantageous for an area-efficient design using layer m5, it was fully dedicated to the top-level power distribution network (PDN). Consequently, layers m2, m3, and m4 are used for the capacitor layout.

Since the capacitor is realized using metal layers m2 to m4, the routing of the decoder circuit shown in Fig. 6 must be done in the remaining layers li and m1. Using the local interconnect layer for routing is a compromise that allows the waffle capacitor design on top of the decoder to use three metal layers for best matching performance, but its higher sheet resistance of \(R_{\mathrm{s,li}}=12.8\,{\mathrm{\Omega}/\Box}\) (in comparison to the metal layers) could become the limiting factor for DAC speed.
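To give a feeling for why the sheet resistance matters, the following sketch estimates the resistance of a local-interconnect wire from its number of squares and the resulting RC time constant when it drives one thermometer cell. The wire length and width are hypothetical values chosen only for illustration; the sheet resistance and the unit capacitance are the values quoted in the text.

```python
R_SHEET_LI = 12.8        # ohm per square, local interconnect (from the text)
C_UNIT = 447e-18         # farad, unit capacitor (from the text)


def wire_resistance(length_um, width_um, r_sheet=R_SHEET_LI):
    """Resistance of a straight wire: sheet resistance times squares."""
    return r_sheet * length_um / width_um


def rc_time_constant(resistance_ohm, capacitance_f):
    """First-order time constant of a lumped RC."""
    return resistance_ohm * capacitance_f


# Hypothetical geometry: 100 um long, 0.17 um wide li wire.
r_wire = wire_resistance(100.0, 0.17)
tau = rc_time_constant(r_wire, 8 * C_UNIT)   # one 8C thermometer cell
print(f"R = {r_wire:.0f} ohm, tau = {tau * 1e12:.1f} ps")
```

A lumped model like this ignores distributed effects and driver resistance, so it only indicates the order of magnitude.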

Fig. 4

a Layout top-view of the core- (thermometer-) cell waffle-capacitor with 8 unit-capacitors (\(8C\)) for reference. b Binary cell with 4 unit capacitors (\(4C\)). c Binary cell with 2 unit capacitors (\(2C\)). d Binary cell with 1 unit capacitor (\(1C\))

Fig. 5

Layouts of the investigated capacitor topologies. Topologies a and b are waffle-capacitor based, d–h are inspired by finger capacitors, and c is a hybrid of both structures. The main capacitance of structures a–d and g is in layer m3-m3, structures e and f use m3-m4, while the main capacitance of h is located in m4-m4 [19]

Fig. 6

Schematic of the DAC drive cell complementary sample signal generator [28] and the row/column decoder circuit [31], which are both integrated into the DAC cells

3.1.2 Capacitor topology evaluation

To evaluate matching and capacitance values in different types of thermometer- and binary-code capacitors, eight different finger- and waffle-capacitor structures, shown in Fig. 5, were compared with a focus on matching, capacitance, and area. Table 2 shows the cell capacitance \(C_{\mathrm{cell}}\) and the unit capacitance of the resulting binary cells \(C^{0}_{\mathrm{1-16}}\). Structures a and b of Fig. 5 show the most suitable compromise concerning matching, capacitance density, and parasitics. The mismatch in the binary capacitor cells of structure c is the highest, which makes the use of binary-coded DAC cells unreasonable. The matching in the binary cells of structure d is the best, but the gain error is increased due to additional parasitic capacitance from the top plate to the shield.

Table 2 Extracted capacitance values of the investigated topologies

Structures e and f are not candidates for this design since the small unit capacitors show high variance. Structure g shows slightly more capacitance density than a or b, but the variance in the binary caps is also high. The last structure, h, is a possible option with reasonable size and matching. As a result, the waffle capacitors of structures a and b were selected as the best topology for the proposed SAR-ADC. The result is a DAC capacitor cell with 8 unit capacitors using a waffle structure and a minimum unit capacitance of \(C^{0}=447\,{\text{a}\text{F}}\).

3.1.3 Semi-differential charge compensation

The capacitor is top-plate sampled; as a result, the top layer of the capacitor is critical for matching and parasitics. The potential of the capacitor bottom plate, on the other hand, is well-defined in this topology, so parasitic capacitance to the bottom plate does not affect matching. Parasitic capacitance to the capacitor top plate can add charge to or pull charge from the DAC capacitor. Constant parasitics to the capacitor top plate, e.g., to \(V_{\mathrm{DD}}\) or \(V_{\mathrm{SS}}\), add gain error, which can be easily compensated. Parasitics to dynamic analog or digital signals add errors that cannot easily be corrected. An additional shielding layer between the capacitor bottom plate m2 and the decoder routing layer m1 would be highly preferable to shield parasitic capacitance between the top plate and the dynamic decoder control signals. Still, the lack of available routing layers does not allow adding such a layer in this configuration.

However, simulations have shown that the gaps between the DAC capacitor cell layouts in the assembled 12-bit DAC matrix expose the top plate to a critical amount of parasitic capacitance, and it is necessary to compensate for this capacitance. Semi-differential wiring of the row and column control wires was therefore introduced in the layout. The active-low column control signal \(\overline{\mathtt{col}}\) has been extended by the active-high signal col. The row control signals \(\overline{\mathtt{row}}\) and \(\overline{\mathtt{rowon}}\) have been extended by the signal \(\overline{\mathtt{rowoff}}\), with the limitation that exactly one signal is at potential 0 V (active), while the other two signals must be set to 1.8 V (inactive).
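The one-cold constraint on the three row signals and the complementary column pair can be sketched for a small matrix. The mapping used here is an assumption made only for illustration: completely filled rows assert \(\overline{\mathtt{rowon}}\), the partially filled row asserts \(\overline{\mathtt{row}}\) so that the columns decide, and all remaining rows assert \(\overline{\mathtt{rowoff}}\). The \(4\times 4\) size and the function names are hypothetical, and logic level 0 stands for 0 V (active) while 1 stands for 1.8 V (inactive).

```python
def decode_rows(n_active, n_rows=4, n_cols=4):
    """Return per-row (row_n, rowon_n, rowoff_n) levels plus column signals.

    Assumed simple row-by-row fill; not the published decoder logic.
    """
    if not 0 <= n_active <= n_rows * n_cols:
        raise ValueError("n_active out of range")
    full_rows, partial = divmod(n_active, n_cols)
    rows = []
    for r in range(n_rows):
        if r < full_rows:
            rows.append((1, 0, 1))      # rowon_n active: whole row on
        elif r == full_rows and partial:
            rows.append((0, 1, 1))      # row_n active: columns decide
        else:
            rows.append((1, 1, 0))      # rowoff_n active: whole row off
    col_n = [0 if c < partial else 1 for c in range(n_cols)]
    col = [1 - v for v in col_n]        # active-high complement of col_n
    return rows, col_n, col


def is_one_cold(signals):
    """True if exactly one signal of the group is at logic 0."""
    return sum(1 for s in signals if s == 0) == 1


rows, col_n, col = decode_rows(6)
print(rows, col_n, col)
```

Because every row group is one-cold and every column wire has a complement, each transition on one wire is accompanied by an opposite transition on its partner, which is the property the layout exploits.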

Parasitic extraction with Magic has been used to match the parasitic capacitance between each control signal and the DAC capacitor top plate. If a control signal changes its state, the injected charge is moved to the semi-differential wire instead, and the net charge injection on the capacitor top plate is compensated. This adds complexity to the layout and the digital logic, and this compromise increases the total parasitic capacitance to the capacitor top plate; however, the initially dynamic error is shifted into the gain error.

3.1.4 Complementary sample signal generation

The row and column control signals are logically evaluated as single-ended signals; however, the pass gates in the decoder circuit shown in Fig. 6 are driven by the differential signals sample_dac and \(\overline{\mathtt{sample\_dac}}\). The complementary pass gates are sensitive to asymmetric switching of the control signals: if the state transitions of the signals sample_dac and \(\overline{\mathtt{sample\_dac}}\) are not symmetric, then \(V_{\mathrm{CM}}\) and vdrv can be shorted, which disturbs the voltage \(V_{\mathrm{CM}}\).

Post-layout simulation has revealed that asymmetric switching is present if the complementary signals are generated in the digital domain; hence, a redesigned signal generation has been implemented in the analog domain. The preferred solution for future designs would be a non-overlapping clock generator for both pass gates to prevent simultaneous on-states of the gates. However, to avoid the routing overhead of additional lines, the complementary sample signal generator shown in Fig. 6 has been implemented in the DAC matrix rows to generate moderately symmetric complementary sample signals [28]. Generating an inverted signal using a single inverter adds the signal transition time of the inverter to the inverted signal; therefore, the complementary signal transitions are asymmetric. The implemented circuit uses the signal transition delay of a non-inverting always-on pass gate to generate a similar signal transition delay in the non-inverted signal. Both signals are then inverted again to obtain similar output drivers.

3.2 Clock generator

Two clock sources are necessary for the proposed ADC, one slow clock for \(V_{\mathrm{CM}}\) generation, and a fast clock signal \(\mathtt{clk\_digital}\) to control the SAR-ADC logic and the comparator. This fast clock is asynchronously generated on-chip [6] using the ring-oscillator principle, shown in Fig. 7. The clock is only active while the ADC is running to save energy.

Fig. 7

Block diagram of the self-clocking ring-oscillator loop in the ADC [6, 30]

The loop generates a clock signal with a period that corresponds to twice the transition time through three delay cells and the comparator conversion delay (the delays of the inverters and NOR gates are negligible). In this case, programmable delay cells (delay modules) are used to adjust the clock frequency to allow enough settling time for charge redistribution after switching operations and comparison. The components X7 and X8 form an edge-detect circuit to suppress multiple conversions from a single positive signal edge of the trigger signal \(\mathtt{start\_conversion}\). In addition, standard cells were used as far as possible to minimize the layout effort and simplify migration to other process nodes.

As shown in Fig. 8, the delay module comprises only standard cells. Organizing the delay chains into \(N\) binary steps (\(2^{0}\cdot 5\,{\text{n}\text{s}}\), \(2^{1}\cdot 5\,{\text{n}\text{s}}\), \(\ldots\), \(2^{N-1}\cdot 5\,{\text{n}\text{s}}\)) allows hardware-efficient configuration using \(N\) configuration wires. AND-gates before the delay inputs are used to bypass unused delay cells and thus reduce power consumption.
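The binary organization of the delay chain can be sketched as follows. The sketch assumes that a cleared configuration bit bypasses its stage completely and that gate and multiplexer delays are negligible; the function name `module_delay_ns` is hypothetical.

```python
UNIT_DELAY_NS = 5  # delay of one custom delay cell (from the text)


def module_delay_ns(config, n_bits):
    """Total delay of a module with n_bits binary-weighted stages.

    Stage i contributes 2**i * 5 ns when bit i of `config` is set and is
    assumed to contribute nothing when bypassed.
    """
    if not 0 <= config < 2 ** n_bits:
        raise ValueError("config does not fit into n_bits")
    return sum(UNIT_DELAY_NS * 2 ** i
               for i in range(n_bits) if (config >> i) & 1)


# Three configuration wires select delays from 0 ns to 35 ns in 5-ns steps.
print([module_delay_ns(c, 3) for c in range(8)])
```

With \(N\) configuration wires the module thus covers \(2^{N}\) delay settings up to \((2^{N}-1)\cdot 5\,\text{ns}\).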

Fig. 8

Delay module (automatically placed-and-routed, using a custom-made 5‑ns delay cell) with the circuit of the custom delay standard cell

The signal delay could be generated by exploiting the delay time of multiple buffers connected in series. The investigated SKY130 standard cells from the high-density (HD) library have shown a delay time in the order of \(360\,{\text{p}\text{s}}\) per cell. A custom delay standard cell was implemented to increase area efficiency with a delay time of \(5\,{\text{n}\text{s}}\) per cell. The circuit in the inset in Fig. 8 consists of a weak inverter, capacitive load, and a Schmitt-trigger circuit as the output stage. As the custom delay cell is designed using the form factor of a SKY130 digital standard cell, the layout of the delay module and the clock generator can be generated using a fully automated digital workflow using a Verilog gate-level description.
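As a rough worked estimate from the two delay figures above (neglecting load and slew dependence, which is a simplifying assumption), the number of HD library buffers needed to replace one custom delay cell is approximately

$$N_{\mathrm{buf}}\approx\left\lceil\frac{5\,{\text{n}\text{s}}}{360\,{\text{p}\text{s}}}\right\rceil=\lceil 13.9\rceil=14$$

which illustrates the area saving of a single custom cell per 5-ns step.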

3.3 Digital control

The digital core module includes the DAC matrix row and column decoders, synchronous digital core logic, and asynchronous oversampling logic. The row and column decoder translates the current DAC code to the matrix row and column wire control signals [16]. It allows the selection of a meander or common-centroid activation style; the implemented modes of operation are shown in Fig. 9. The synchronous logic handles the non-binary SAR algorithm [14] with a non-binary ratio of \(1.65\), clock loop control, and the calculation of the 12-bit result with averaging of the four least significant SAR weights. The oversampling module, which is asynchronously clocked using a strobe signal, calculates the 16-bit oversampled result [17].

Fig. 9

The implemented DAC matrix row/column decoder modes, depending on the configuration bits row_mode and col_mode, in a \(4\times 4\) matrix

The non-binary SAR algorithm in [22] was adapted for a fully differential SAR-ADC. \(M\) is the resolution in bit, \(\Theta[k]\in\mathbb{N}\) is the \(k^{\text{th}}\) weight of \(N\) weights while \(N\) is limited to \(\{N\in\mathbb{N}_{0}:N\geq M\}\), and \(s[k]\) is the \(k^{\text{th}}\) comparison result with the following conditional value:

$$s[k]=\begin{cases}+1,&\text{if }V_{\mathrm{ctop,p}}[k]-V_{\mathrm{ctop,n}}[k]> 0\\ -1,&\text{otherwise}\end{cases}$$
(1)
$$V_{\mathrm{ctop,p}}[k]=V_{\mathrm{ctop,p}}[0]+V_{\mathrm{DAC+}}[k]-V_{\mathrm{CM}}$$
(2)
$$V_{\mathrm{ctop,n}}[k]=V_{\mathrm{ctop,n}}[0]+V_{\mathrm{DAC-}}[k]-V_{\mathrm{CM}}$$
(3)

The binary DAC value \(d_{\mathrm{DAC\pm}}[k]\) determines the analog DAC reference voltage \(V_{\mathrm{DAC\pm}}[k]\) at the \(k^{\text{th}}\) step:

$$d_{\mathrm{DAC\pm}}[k]=2^{M-1}\mp\sum_{i=2}^{k}{s[i-1]\cdot\Theta[i]}$$
(4)

The chosen DAC weights \(\Theta[i]\) are {2048, 806, 486, 295, 180, 110, 67, 41, 25, 15, 9, 6, 4, 2, 1} with \(\sum\Theta[i]=2^{M}-1\) for \(M=12\). The result \(d\) for the non-binary \(M\)-bit SAR-ADC using \(N\) weights is in the set \(\{d\in\mathbb{N}_{0}:0\leq d\leq 2^{M}-1\}\) and is calculated using the generalized formula:

$$d=2^{M-1}+\sum_{i=2}^{N}{s[i-1]\cdot\Theta[i]}+\frac{1}{2}(s[N]-1)$$
(5)
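Equations (1), (4), and (5) can be exercised with a minimal behavioral sketch. It assumes an ideal, noise-free comparator and a normalized input `x` expressed in LSB (so that the signed residue seen by the comparator is `x` minus mid-scale minus the accumulated weights); the function name `sar_convert` and this normalization are illustrative assumptions and are not taken from the published RTL.

```python
M = 12
WEIGHTS = [2048, 806, 486, 295, 180, 110, 67, 41, 25, 15, 9, 6, 4, 2, 1]


def sar_convert(x, weights=WEIGHTS, m=M):
    """Ideal non-binary SAR conversion of a normalized input x (in LSB).

    weights[k - 1] holds Theta[k]; Theta[1] equals the mid-scale 2**(m-1).
    """
    mid = 2 ** (m - 1)
    acc = 0   # running sum of s[i-1] * Theta[i] for i = 2..k
    s = 0     # latest comparison result s[k]
    for k in range(1, len(weights) + 1):
        if k >= 2:
            acc += s * weights[k - 1]        # apply previous decision, Eq. (4)
        residue = (x - mid) - acc            # normalized comparator input
        s = 1 if residue > 0 else -1         # decision rule, Eq. (1)
    return mid + acc + (s - 1) // 2          # final result, Eq. (5)


# The 15 weights sum to 2**12 - 1, and mid-code inputs map to their code.
print(sum(WEIGHTS), sar_convert(1000.5), sar_convert(4095.5))
```

In this ideal model the redundancy of the 15 weights is not needed; its benefit appears once early decisions are disturbed by incomplete settling or noise, which this sketch does not model.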

3.4 Top-level hardening

As seen in Fig. 2, the separate analog and digital building blocks must be combined in a top-level cell. The current OpenLane chip integration documentation [23] specifies a macro hardening stage, integration of hardened macros into the chip core, and integration of the hardened core into the pad frame. In such a hierarchical design, each hierarchical level reduces the highest allowed metal layer by one. At the chip core level, metal layer m5 is used for the PDN to connect macros to their power rails on m4. However, a macro inside another macro would be limited to m3 as the highest layer, and so on.

This limitation had to be overruled in this work because the SKY130 MIMCAP layers use m4 even at the lowest hierarchy level. According to the documentation, a macro with MIMCAP layers, as used in the comparator and the \(V_{\mathrm{CM}}\) generator, must be integrated at the OpenLane core hierarchy. A method has been found to integrate macros at the same hierarchical metal layer as the current PDN generation: the ADC hard-macro IP is hardened as a chip core using metal layers up to m5 in a way that allows integration into another chip core at metal layer m5. Done correctly, this method violates only the documented rules for chip integration, not the design rules of the process. The PDN generator has been found to avoid short circuits and to connect macros to the PDN if the PORT layers in the LEF file at the PDN metal layers m4 and m5 are also defined in the FP_PDN_MACRO_HOOKS configuration of OpenLane. All other nets specified in the LEF file at the PDN layers must be protected by obstruction (OBS) layers. For example, these obstruction layers have been used in the DAC matrix to prevent the generation of m5 structures above the top plate of the DAC capacitor. The hardened SAR-ADC macro is surrounded by a core ring, which allows power connections of the SAR-ADC to the top-level power network at the edge of the IP block.

OpenLane has been configured to generate the SAR-ADC top-level hard-macro IP using a digital-on-top workflow; the different hardening stages, viewed in OpenROAD, are shown in Fig. 10. The workflow uses the GDSII and LEF file formats as input for the custom analog macro cells (DAC, comparator, \(V_{\mathrm{CM}}\) generator, clock generator), and the digital RTL is synthesized, as shown in Fig. 1. The macros are placed in the top-level macro area at manually specified coordinates with fixed core dimensions. The digital standard cell grid is generated around the placed macros in the unused space. The analog macros and the digital standard cell grid are connected to the generated PDN on m4 and m5. This work uses a vertical m4 and a horizontal m5 PDN. For the clock generator, a custom PDN generation Tcl script was used to allow a connection from m4 down to m3, since the power rails of the clock generator have been placed on m3. After PDN generation, the clock tree is synthesized, standard cells are placed on the standard cell grid, and routing is performed. After several rounds of optimization, the flow is completed, and the top-level GDS and LEF are generated. Since the OpenLane workflow was initially intended for digital designs, the layout is then optimized manually using KLayout for cleaner and wider analog signal routes.

Fig. 10

Layout exploration using OpenROAD. From left to right: manual macro placement; power distribution network and standard cell power grid included; routing included

4 Simulation results

The transfer function of the assembled DAC before and after post-layout parasitic \(C\)-extraction using Magic is shown in Fig. 11. The simulator ngspice has been used to run the simulation. Fig. 11a shows the inherently good linearity of the capacitive charge-redistribution DAC topology, but the full-scale range is limited to 0.167–1.672 V due to gain and offset errors. The zoomed DAC transfer function in Fig. 11b shows the mismatch in the LSBs.

The top-level post-layout SPICE netlist has been generated using the parasitic \(C\)-extraction of the VLSI layout tool Magic with the configuration ext2spice cthresh 0.1. The resulting netlist contains about 26,000 MOSFETs and 60,000 capacitors. The schematic tool Xschem has been used to build a test environment with disabled averaging, no oversampling, and the fastest clock frequency. The A/D conversion has been simulated using the parallel simulator Xyce with the precision settings ABSTOL=1E-15 and RELTOL=1E-6.

Although Monte Carlo mismatch simulations are possible with the presented open-source tools, they could not be performed for the proposed SAR-ADC since the design is far too big, leading to a very long simulation time for a reasonable set of data points. However, the proposed SAR-ADC has been verified by corner simulations. The plot in Fig. 12 shows a conversion time of \(T_{\mathrm{conv}}=694\,{\text{n}\text{s}}\), which corresponds to a sample rate of \(f_{s}=1.44\,{\text{M}\text{S}/\text{s}}\), at an average power consumption of \(703\,{\upmu\text{W}}\). Table 3 shows that the performance of this open-source SAR-ADC is comparable to state-of-the-art designs created with commercial tools.
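The relation between the simulated conversion time, the sample rate, and the power consumption can be reproduced with a few lines. The energy per conversion and the decimated output rate are derived here from the quoted figures purely as an illustration; they are not results reported by the simulation, and the function names are hypothetical.

```python
T_CONV_S = 694e-9    # simulated conversion time (from the text)
P_AVG_W = 703e-6     # average power consumption (from the text)


def sample_rate_hz(t_conv_s):
    """Sample rate of a converter that starts a new sample every t_conv_s."""
    return 1.0 / t_conv_s


def energy_per_conversion_j(p_avg_w, t_conv_s):
    """Average energy spent per conversion."""
    return p_avg_w * t_conv_s


def decimated_rate_hz(fs_hz, osr):
    """Output rate after boxcar decimation by the oversampling rate."""
    return fs_hz / osr


fs = sample_rate_hz(T_CONV_S)
print(f"fs = {fs / 1e6:.2f} MS/s")
print(f"E  = {energy_per_conversion_j(P_AVG_W, T_CONV_S) * 1e12:.0f} pJ")
print(f"fs/256 = {decimated_rate_hz(fs, 256) / 1e3:.2f} kS/s")
```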

Fig. 11

DAC matrix transfer curves of the pre-layout and post-layout simulation (the data vector sweeps from value 0 to 4095). a shows the full transfer curve, b is zoomed to DAC code 2048, showing the mismatch in the LSB bit translation

Fig. 12

Post-layout simulation of an A/D conversion at the fastest configuration \(f_{\mathrm{s}}=1/694\,{\text{n}\text{s}}=1.44\,{\text{M}\text{S}/\text{s}}\). The differential voltage \(V_{\mathrm{inp}}-V_{\mathrm{inn}}\) has been set to \(200\,{\text{m}\text{V}}\). The conversion has been triggered after power-up and settling of the generated common mode voltage \(V_{\mathrm{CM}}\)

Table 3 Comparison to similar SAR-ADCs

5 Conclusion

A 12-bit SAR-ADC has been designed (using only open-source EDA tools) for the open-source SKY130 technology. It is highly configurable and features integrated decimation filters to increase the output word size up to 16 bit. The integrated clock generator is hardened using a digital design workflow that has been extended with custom-made cells. A production-ready layout (shown in Fig. 13, including an overlay indicating the main blocks) has been generated with a digital-on-top design flow, integrating the synthesized RTL of the digital control block with custom analog layouts of the DAC, comparator, and \(V_{\mathrm{CM}}\) generator, next to the automatically hardened clock generator.

Fig. 13

Floor plan of the proposed 12-bit SAR-ADC layout (\(434\,{{\upmu}\text{m}}\times 403\,{{\upmu}\text{m}}\)). The 3D visualization has been created using gdsiistl [13] and Blender [5]

The simulated performance of this ADC is comparable to similar SAR-ADCs that have been designed with commercial tools and demonstrates the progress that free and open-source integrated circuit design tools have made recently.