Journal of Electronic Testing, Volume 28, Issue 5, pp 625–640

High Speed On-Chip Signal Generation for Debug and Diagnosis

Authors

  • Tsung-Yen Tsai
    • Integrated Microsystems Laboratory, Department of Electrical & Computer Engineering, McGill University
  • Sadok Aouini
    • Integrated Microsystems Laboratory, Department of Electrical & Computer Engineering, McGill University
  • G. W. Roberts
    • Integrated Microsystems Laboratory, Department of Electrical & Computer Engineering, McGill University
Open Access Article

DOI: 10.1007/s10836-012-5289-0

Cite this article as:
Tsai, T., Aouini, S. & Roberts, G.W. J Electron Test (2012) 28: 625. doi:10.1007/s10836-012-5289-0

Abstract

This article presents methods and circuits for synthesizing test signals in the time/frequency domain. An arbitrary signal is first encoded using sigma–delta modulation in the digital amplitude domain and converted to the time or frequency domain through a digital-to-time converter (DTC) or digital-to-frequency converter (DFC) operation realized in software. In hardware, the resulting bit-stream is applied cyclically to a high-order phase-locked loop (PLL) behaving as a time-mode reconstruction filter in the appropriate domain (time or frequency). A high-speed prototype implementation consisting of a 4th-order PLL built in a 0.13 μm complementary metal oxide semiconductor (CMOS) process with an off-chip loop filter has been fabricated and used to generate signals at 4 GHz. The digital nature and portability of the phase/frequency test signal generation process make the proposed scheme compatible with the IEEE 1149.1 test bus standard and easily amenable to any testing environment: production, characterization, design-for-test (DFT), or built-in self-test (BIST).

Keywords

Analog test, Mixed-signal test, Design-for-test, Built-in self-test, Phase generation, Frequency synthesis, Sigma-delta encoding, Integrated circuit, Phase-locked loop

1 Introduction

The ability to generate high-frequency test signals on-chip that can be made to vary over frequency and phase under external program control provides a useful debug and diagnosis tool (Fig. 1). As operating frequencies and pin counts rise, it is increasingly difficult and costly to route high-speed signals on- and off-chip. Many factors need to be considered, such as impedance matching, mismatch uncertainty, crosstalk, and clock skew, among others, all of which require extensive analysis. Also, adding pins to an already crowded package exclusively for testing may not be acceptable in all situations. As such, design-for-test (DFT) techniques that use circuitry already available on-chip are preferable.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig1_HTML.gif
Fig. 1

System implementation of on-chip high-frequency generator utilizing the 1149.1 test bus standard

Traditionally, information has been processed and encoded in the voltage/amplitude domain; however, more recently the encoding of information in time has gained considerable popularity [23] and [6]. Time mode signal processing involves encoding the information in the form of time difference variables using phase modulation. A digital-to-time converter (DTC) can be seen as any device used to map a digital value to a time-based signal, similar to a digital-to-analog converter (DAC) in the voltage/amplitude domain. Likewise a digital-to-frequency converter (DFC) converts a digital code to a corresponding instantaneous frequency. From a system perspective, such a conversion process should always be combined with a reconstruction filter to eliminate the images in the output spectrum. In [2], it has been shown that a phase-locked loop can be used as a reconstruction filter for DTCs.

In this work we limit our discussion to the generation of high-frequency signals on-chip under external program control. As depicted in Fig. 2, the phase/frequency signal generation process consists of first encoding a DC signal using sigma–delta modulation. Sigma–delta modulators are basically oversampling analog-to-digital converters, meaning that the sampling rate of the signal is increased to the point where a low-resolution quantizer is enough for accurate digitization. Then, the sigma–delta encoded digital signal is converted to the time or frequency domain through a DTC or DFC process. Finally, a PLL is used as a time/frequency domain analog reconstruction filter. The hardware implementation consists of only a periodic bit-stream containing the sigma–delta encoded phase or frequency signal and a PLL. In fact, it can easily be incorporated within a DFT or BIST framework, such as that described in [18], or in a boundary-scan compliant scheme [8], as depicted in Fig. 1. Reusing an existing BIST framework lowers both the cost and the difficulty of implementation: the IEEE 1149.1 test bus routes the sigma–delta bitstream to an already existing PLL, whose output can then be routed to the device under test.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig2_HTML.gif
Fig. 2

Phase/frequency test signal generation process where a DC signal is encoded using sigma–delta modulation, converted to phase or frequency through a DTC or DFC, and reconstructed using a PLL serving as time/frequency domain filter

The test implementation as presented is appropriate for testing analog circuitry. For testing of digital circuits, proper buffering and/or level shifting may be required. As the PLL in this implementation is used as a time-domain filter, any PLL with the appropriate bandwidth may be used. PLLs have been demonstrated in 90 nm and 45 nm technologies, so the technique should be fully compatible with them. Because the technique requires only a bitstream and a PLL, it scales with technology, and a higher-speed implementation should be realizable.

This paper is organized as follows: first, the phase and frequency encoding process using sigma–delta modulation is described in Section 2. In Section 3, MATLAB/Simulink simulations are used to validate the technique. Section 4 describes the design of a prototype phase/frequency test signal generator incorporating a custom high-speed PLL built in 0.13 μm CMOS and running at 4 GHz, as well as the custom PCB used to interface the PLL chip to the test equipment. The experimental results are then outlined in Section 5 and, finally, conclusions are drawn and potential future avenues are discussed in Section 6.

2 Phase and Frequency Encoding Using Sigma–Delta Modulation

In this section, an overview of the phase encoding process using sigma–delta modulation and digital-to-time conversion techniques is presented and then extended to frequency encoding using digital-to-frequency conversion.

2.1 Phase Encoding

A digital input can be converted to a phase modulated signal through a digital-to-time conversion process as described in [1]. Referring to Fig. 3, the process is as follows: a parallel multi-bit digital input is applied to the input of a DTC and a serial output with the corresponding time delay is generated. The mapping process between the input bits and the output time signal is assumed to be represented by
$$ t_{\rm out} = t_{\rm ref}\left(b_0 + b_12^1 + b_22^2 + \cdots + b_{D-1}2^{D-1}\right) + t_{\rm os}, $$
(1)
where $t_{\rm out}$ is the time output, $t_{\rm ref}$ the reference time, $t_{\rm os}$ a time offset, and $b_0, \ldots, b_{D-1}$ the bits of the $D$-bit digital input.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig3_HTML.gif
Fig. 3

A D-bit digital-to-time converter: a digital input gives a corresponding time output

A way to understand the DTC mapping process is to look at the output sequence in terms of the input sequence, as illustrated in Fig. 4. In the case of a 1-bit DTC having a 90-degree phase encoding range, every ‘0’ value in the digital domain is mapped to the bit sequence ‘1100’ and every ‘1’ is mapped to the sequence ‘0110’. Note here that the same concept can be extended to a multi-bit, different phase range and duty cycle encoding. In this case, to preserve the sampling frequency, the output phase code should be clocked four times faster than the input amplitude code. The PLL will also lock on to this sampling frequency, as it is the carrier frequency for this phase modulation scheme.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig4_HTML.gif
Fig. 4

Illustrating the phase encoding process for a 1-bit DTC having a 90-degree phase encoding range
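The 1-bit mapping described above can be sketched in a few lines of Python. This is only an illustration, not the authors' software: the function names are hypothetical, and the only facts taken from the text are the two codewords and the four-times-faster output clock.

```python
# Illustrative sketch; the helper names below are hypothetical.
# Codewords come from the text: '0' -> '1100', '1' -> '0110'.
DTC_CODEWORDS = {0: "1100", 1: "0110"}


def dtc_encode(bits):
    """Map each input bit to its 4-bit phase codeword and concatenate."""
    return "".join(DTC_CODEWORDS[b] for b in bits)


def phase_shift_degrees(codeword, reference="1100"):
    """Return the cyclic delay of `codeword` relative to `reference` in degrees."""
    n = len(reference)
    for shift in range(n):
        # Delaying by `shift` bits rotates the reference to the right.
        delayed = reference[n - shift:] + reference[:n - shift]
        if delayed == codeword:
            return 360.0 * shift / n
    raise ValueError("codeword is not a delayed copy of the reference")


if __name__ == "__main__":
    print(dtc_encode([0, 1, 1, 0]))
    print(phase_shift_degrees("0110"))
```

Because each input bit expands to four output bits, the output stream has to be clocked four times faster than the input, and a one-bit delay of the '1100' pattern corresponds to a quarter of a carrier period.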

Digital-to-time conversion can be combined with sigma–delta modulation to digitally encode a phase signal, as depicted in Fig. 2. The amplitude-domain sigma–delta modulator followed by the DTC is equivalent to a sigma–delta modulation process occurring in phase. Hence, all the parameters of the sigma–delta modulator are mapped in a one-to-one correspondence to the phase domain. The modulator order, bandwidth and SNR should be equivalent in both domains. Thus, the maximum value of the sigma–delta modulated signal in the amplitude domain, $\Sigma\Delta_{\rm MAX}$, is mapped to the maximum phase shift $\phi_{\rm MAX}$; likewise, the minimum value, $\Sigma\Delta_{\rm MIN}$, is mapped to the minimum phase shift $\phi_{\rm MIN}$, regardless of whether a single-bit or multi-bit conversion is used. The amplitude-to-phase mapping coefficient is defined as
$$ \alpha_{\phi} = \frac{\phi_{\rm MAX} - \phi_{\rm MIN}}{\Sigma\Delta_{\rm MAX} - \Sigma\Delta_{\rm MIN}} \left[\frac{\it rads}{V}\right] $$
(2)
This equation defining $\alpha_{\phi}$ can also be seen as the full-scale range of the DTC divided by the full-scale range of the sigma–delta converter. In addition, an offset term $\phi_{\rm os}$ can be present in the relation between the output instantaneous phase $\phi_{\rm out}$ and the DTC input, denoted $DTC_{\rm in}$, as given by
$$ \phi_{\rm out} = \alpha_{\phi} DTC_{\rm in} + \phi_{\rm os} $$
(3)

Eqs. 1 and 3 are related: multiplying $t_{\rm ref}$ in Eq. 1 by $\omega_s$ ($\omega_s = 2\pi/t_S$), where $t_S$ is the sampling period of the DTC as shown in Fig. 1, converts it to the phase domain and gives $\alpha_{\phi}$. Likewise, $t_{\rm os}$ of Eq. 1 can be converted to $\phi_{\rm os}$ by the same relationship.
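As a worked check of this relationship, consider the 1-bit DTC with a 90-degree encoding range described earlier in this section. The numbers below are assumptions made only for illustration: the sigma–delta output is taken to span 0 to 1, the phase range is taken to be 0 to $\pi/2$, and $t_S$ is taken to be the duration of one four-bit output word, so that one output bit lasts $t_S/4$.

$$ t_{\rm ref} = \frac{t_S}{4} \;\Rightarrow\; \omega_s t_{\rm ref} = \frac{2\pi}{t_S}\cdot\frac{t_S}{4} = \frac{\pi}{2}, \qquad \alpha_{\phi} = \frac{\phi_{\rm MAX} - \phi_{\rm MIN}}{\Sigma\Delta_{\rm MAX} - \Sigma\Delta_{\rm MIN}} = \frac{\pi/2 - 0}{1 - 0} = \frac{\pi}{2} $$

Under these assumptions, the time-domain route (Eq. 1 scaled by $\omega_s$) and the definition in Eq. 2 give the same coefficient of $\pi/2$ rad per unit of input.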

Since the DTC relates an input amplitude to a corresponding output phase signal by multiplying it by $\alpha_{\phi}$, the spectrum of the DTC output signal can be written in terms of the sigma–delta PSD output as (Fig. 5)
$$ S_{\rm DTC}(f) = \begin{cases} \left(\alpha_{\phi}\sqrt{S_{\Sigma\Delta}(f)} + \phi_{\rm os}\right)^2 & f = 0 \\[5pt] \alpha_{\phi}^2 S_{\Sigma\Delta}(f) & f \neq 0 \end{cases} $$
(4)
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig5_HTML.gif
Fig. 5

PSD mapping from the amplitude domain through the DTC block to the phase domain

Given that the PSD of the sigma–delta output can be decomposed into a signal component $S_{S,\Sigma\Delta}(f)$ and a noise component $S_{N,\Sigma\Delta}(f)$, the PSD of the DTC output can be written as (for $f \neq 0$)
$$ S_{\rm DTC}(f) = \alpha_{\phi}^2 S_{S,\Sigma\Delta}(f) + \alpha_{\phi}^2 S_{N,\Sigma\Delta}(f) $$
(5)
It can be observed that a carefully designed phase-filtering function, implemented by the PLL, must be realized to properly filter the out-of-band quantization noise. To ensure that the in-band noise level is negligible, the phase-filtering function of the PLL must at least match the bandwidth and have a higher order than the sigma–delta modulator. Consequently, the PLL phase transfer function must be designed accordingly. However, just as for the amplitude domain, if the signal encoded using sigma–delta modulation has a smaller bandwidth than the modulator, the order of the filtering function of the PLL can be relaxed by also lowering its bandwidth. The SNR of the overall process can then be found from the terms above, which shows that the SNR in the phase domain is indeed equal to the one in the amplitude domain, i.e.,
$$ SNR_{\phi} = \dfrac{P_{S,\phi}}{P_{N,\phi}} = \dfrac{\alpha_{\phi}^2\int_0^{f_B}S_{S,\Sigma\Delta}(f)\,df}{\alpha_{\phi}^2\int_0^{f_B}S_{N,\Sigma\Delta}(f)\,df} = SNR_{\Sigma\Delta} $$
(6)

2.2 Frequency Encoding

Just as for the DTC, the DFC is used to convert a digital input signal to a corresponding frequency. The general equation relating the input bits and instantaneous output frequency can be written as
$$ f_{\rm out} = f_{\rm ref}\left(b_0 + b_12^1 + b_22^2 + \cdots + b_{D-1}2^{D-1}\right) + f_{\rm os} $$
(7)
where $f_{\rm out}$ is the frequency output, $f_{\rm ref}$ the reference frequency, $f_{\rm os}$ a frequency offset, and $b_0, \ldots, b_{D-1}$ the bits of the $D$-bit digital input.
Once again, the operation of the DFC is to take as input a bit-stream consisting of D-bit words and create as output a 1-bit bit-stream in which each input word is mapped to a corresponding sequence of bits representing a particular frequency component. For example, for a 1-bit DFC consisting of two frequencies, $f_1$ and $f_2$, where $f_2 = 2f_1$, the DFC converts a logical ‘0’ input bit to a ‘1100’ output sequence and a logical ‘1’ input bit to a ‘1010’ output sequence, at a clock rate four times the original bit-stream clock rate. The mapping algorithm is demonstrated in Fig. 6.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig6_HTML.gif
Fig. 6

Amplitude-to-frequency mapping where every ‘1’ value is mapped to a frequency twice the rate of a ‘0’ value
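A minimal Python sketch of this 1-bit frequency mapping follows. The function names are hypothetical, and the 180 MHz bit clock used in the usage lines is the DFC sampling frequency quoted in Section 3.1; the codewords are the ones given in the text.

```python
# Illustrative sketch; helper names are hypothetical.
# Codewords come from the text: '0' -> '1100' (f1), '1' -> '1010' (f2 = 2*f1).
DFC_CODEWORDS = {0: "1100", 1: "1010"}


def dfc_encode(bits):
    """Map each input bit to its 4-bit frequency codeword and concatenate."""
    return "".join(DFC_CODEWORDS[b] for b in bits)


def cycles_per_word(codeword):
    """Count 0->1 transitions when the codeword is repeated cyclically."""
    n = len(codeword)
    # codeword[i - 1] wraps around to the last bit when i == 0.
    return sum(1 for i in range(n)
               if codeword[i - 1] == "0" and codeword[i] == "1")


def tone_frequency_hz(codeword, bit_clock_hz):
    """Frequency of the square wave produced by repeating `codeword`."""
    return cycles_per_word(codeword) * bit_clock_hz / len(codeword)


if __name__ == "__main__":
    print(dfc_encode([0, 1]))
    print(tone_frequency_hz("1100", 180e6), tone_frequency_hz("1010", 180e6))
```

Repeating '1100' yields one cycle per word and '1010' yields two, which is why a '1' is encoded at twice the frequency of a '0'.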

The reasoning used for phase synthesis applies equally to frequency synthesis. In the DFC process, the sigma–delta bits are mapped to instantaneous frequencies. Here, the maximum value of the sigma–delta modulated signal in the amplitude domain, $\Sigma\Delta_{\rm MAX}$, is mapped to the maximum frequency $f_{\rm MAX}$; likewise, the minimum value, $\Sigma\Delta_{\rm MIN}$, is mapped to the minimum frequency $f_{\rm MIN}$, regardless of whether a single-bit or multi-bit conversion is used. Here again, a mapping coefficient between the amplitude and frequency domains can be defined as
$$ \alpha_{f} = \frac{f_{\rm MAX} - f_{\rm MIN}}{\Sigma\Delta_{\rm MAX} - \Sigma\Delta_{\rm MIN}} \left[\frac{\it Hz}{V}\right] $$
(8)
Note that $\alpha_f$ can be seen as the full-scale range of the DFC divided by the full-scale range of the sigma–delta modulator. In addition, an offset term $f_{\rm os}$ can be present in the relation between the output frequency $f_{\rm out}$ and the DFC input $DFC_{\rm in}$, as given by
$$ f_{\rm out} = \alpha_{f} DFC_{\rm in} + f_{\rm os} $$
(9)

Eqs. 7 and 9 are related: $f_{\rm ref}$ in Eq. 7 can be expressed as $\alpha_f$, and $f_{\rm os}$ is the same quantity in both equations.

The spectrum of the DFC output signal can then be written like before in terms of the sigma–delta PSD output as
$$ S_{\rm DFC}(f) = \begin{cases} \left(\alpha_{f}\sqrt{S_{\Sigma\Delta}(f)} + f_{\rm os}\right)^2 & f = 0 \\[5pt] \alpha_{f}^2 S_{\Sigma\Delta}(f) & f \neq 0 \end{cases} $$
(10)
which we can then write in a more detailed form (for f ≠ 0) as
$$ S_{\rm DFC}(f) = \alpha_{f}^2 S_{S,\Sigma\Delta}(f) + \alpha_{f}^2 S_{N,\Sigma\Delta}(f) $$
(11)

Assuming the quantization noise carried over from the sigma–delta encoding process is removed by a filtering function realized by the PLL, one can also show, following the arguments of the previous subsection, that the SNR of the DFC process equals the SNR established by the sigma–delta encoding process.

3 MATLAB Modelling and Simulation Results

A third-order sigma–delta modulator was implemented in MATLAB. This allows for relatively quick simulation and verification of any changes made to the modulator, such as bandwidth or sampling frequency. The modulator was designed with a sampling frequency of 65 MHz, an OSR of 16, resulting in a bandwidth of approximately 2 MHz. It was designed using DSMOD [5] and simulated in Simulink using the general model shown in Fig. 7. The output PSD of the modulator is shown in Fig. 8. A Kaiser window with beta of 30 was used. The marker designates the input DC level (0.434).
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig7_HTML.gif
Fig. 7

General sigma–delta modulator structure

https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig8_HTML.gif
Fig. 8

Output PSD of sigma–delta modulator with 0.434 DC input
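To illustrate how a DC level is carried by a 1-bit stream, the sketch below encodes the 0.434 input with a first-order sigma–delta loop. This is a deliberate simplification and an assumption of this example: the modulator in the paper is third order, designed with DSMOD and simulated in Simulink, and its noise shaping is much steeper. The function names are hypothetical. The same block also reproduces the bandwidth arithmetic, assuming the usual relation bandwidth = sampling frequency / (2 × OSR).

```python
# Illustrative first-order sigma-delta loop (NOT the paper's third-order design).
def first_order_sigma_delta(dc_input, n_samples):
    """Encode a DC level in [0, 1] as a list of 0/1 bits."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for _ in range(n_samples):
        # Accumulate the error between the input and the previous output bit.
        integrator += dc_input - feedback
        # 1-bit quantizer with a mid-scale threshold.
        bit = 1 if integrator >= 0.5 else 0
        feedback = float(bit)
        bits.append(bit)
    return bits


def signal_bandwidth_hz(sampling_hz, osr):
    """Bandwidth implied by an oversampling ratio, assuming fs / (2 * OSR)."""
    return sampling_hz / (2 * osr)


if __name__ == "__main__":
    stream = first_order_sigma_delta(0.434, 4096)
    print(sum(stream) / len(stream))        # average of the bits, close to 0.434
    print(signal_bandwidth_hz(65e6, 16))    # roughly 2 MHz
```

The average of the bit-stream converges to the DC input, and it is this average that a sufficiently narrow reconstruction filter recovers.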

The closed-loop response of the PLL can be seen in Fig. 9. The calculated transfer function (solid line) is compared with the transfer function of a linearized model (line with “O” markers) from Simulink. It can be seen that the two match exactly. The loop filter response is plotted in Fig. 10, with the calculated transfer function in solid and the linearized model with “O” markers. It can be observed that a pole at DC is present.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig9_HTML.gif
Fig. 9

PLL closed-loop transfer function

https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig10_HTML.gif
Fig. 10

PLL loop filter transfer function

3.1 System Simulation

Figure 11 outlines the implementation of the phase/frequency synthesis system. Going back to Fig. 2, the hardware implementation consists of a cyclic memory with the sigma–delta encoded phase signal applied to a high-order PLL behaving as a time-domain filter to eliminate the quantization noise. Thus, if the PLL bandwidth also matches that of the sigma–delta modulator, the encoded phase signal can be accurately recovered. The sampling frequency of the DFC is 180 MHz; correspondingly, the two encoded frequencies are 45 and 90 MHz. The DTC is sampled at 260 MHz, which results in a carrier at 65 MHz. As shown in Fig. 11, the output of the DFC/DTC is stored in memory and repeated so that it appears as a constant, uninterrupted bitstream.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig11_HTML.gif
Fig. 11

Implementation of phase/frequency synthesis system

The frequency synthesis system was simulated with Simulink for three different DC input conditions; specifically, 0.434, 0.452, and 0.470 on a sigma–delta output scale of 0 to 1. The corresponding output spectra are superimposed on the plot shown in Fig. 12. As is evident, the PLL is locked at 4.13 GHz, 4.18 GHz and 4.23 GHz. These frequencies match those calculated by Eq. 9 when scaled by the PLL divider ratio of 64. Likewise, the phase synthesis system was simulated with a 0.01 amplitude, 2 MHz sinusoidal input. The output spectrum of the PLL can be found in Fig. 13, where a 2 MHz sideband is centred around a 4.16 GHz carrier (65 MHz × 64).
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig12_HTML.gif
Fig. 12

Three tones produced by the frequency generator

https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig13_HTML.gif
Fig. 13

Output spectrum of 0.01 amp, 2 MHz phase-encoded signal with 4.16 GHz carrier
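The three lock frequencies can be reproduced from Eq. 9 with a short calculation. The sketch assumes that the sigma–delta scale of 0 to 1 maps onto the two encoded DFC frequencies of 45 and 90 MHz, so that the offset in Eq. 9 is 45 MHz; this mapping is inferred from the values quoted in this section rather than stated explicitly, and the function names are hypothetical.

```python
# Illustrative calculation of Eq. 9 scaled by the PLL divider ratio.
F_MIN_HZ = 45e6        # frequency encoded by a '0' (assumed to be f_os)
F_MAX_HZ = 90e6        # frequency encoded by a '1'
SD_MIN, SD_MAX = 0.0, 1.0
DIVIDER_RATIO = 64


def alpha_f():
    """Amplitude-to-frequency mapping coefficient of Eq. 8."""
    return (F_MAX_HZ - F_MIN_HZ) / (SD_MAX - SD_MIN)


def pll_output_hz(dc_input):
    """Eq. 9 evaluated for a DC input, then multiplied by the divider ratio."""
    f_encoded = alpha_f() * dc_input + F_MIN_HZ
    return DIVIDER_RATIO * f_encoded


if __name__ == "__main__":
    for level in (0.434, 0.452, 0.470):
        print(level, round(pll_output_hz(level) / 1e9, 2), "GHz")
```

Under these assumptions a step of 0.018 in the DC input moves the output by about 52 MHz, which is the spacing between the three tones.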

4 Phase/Frequency Generator Implementation

The phase/frequency generator consists of a cyclic memory element, containing a phase or frequency encoded sigma–delta bitstream, and a custom phase-locked loop whose bandwidth corresponds to that of the software-based sigma–delta modulator. This system is shown in Fig. 11. The details of both the cyclic memory and the design of the custom PLL are described, as well as the design of a PCB that allows the PLL to interface with test equipment.

4.1 Cyclic Memory

The phase/frequency generator requires a cyclic memory element in order to present an uninterrupted phase or frequency encoded bitstream to the PLL. This is accomplished by using an external pattern generator, further described in the Experimental Results section. Any programmable memory element can be used for this purpose (e.g., an FPGA).

4.2 Custom PLL Design

In order to test frequency and phase synthesis at high speeds, a custom PLL had to be designed and built. A top-down design methodology was employed to impose a desired phase transfer function, as described in [1]. The IBM cmrf8sf 130 nm process was chosen as the technology for fabrication.

4.2.1 Transistor-Level Design

After the phase transfer function of the PLL had been determined, each component of the PLL was designed and implemented at the transistor level using Cadence. A general block diagram of the components comprising a typical charge pump PLL is shown in Fig. 14.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig14_HTML.gif
Fig. 14

General block diagram of charge pump PLL

Phase-Frequency Detector   The role of a phase-frequency detector (PFD) is to compute the phase and frequency error between the input reference signal and the voltage-controlled oscillator output. This information is delivered as up/down pulses, which implies three valid states: up, down, and no output. A fourth possible state, both outputs on, is invalid and triggers a reset through an AND gate. A more thorough description of how PFDs operate can be found in [13]. The PFD used is implemented with two D flip-flops, as shown in Fig. 15. The upper output is “up_bar” due to the specific implementation of the charge pump, described in more detail in the next section. The “down” output has a non-inverting buffer in order to somewhat equalize the delay introduced by the inverting buffer needed to invert the “up” signal.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig15_HTML.gif
Fig. 15

Block diagram of PFD
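The three-state behaviour of the PFD can be sketched as an event-driven model. This is an idealized illustration under stated assumptions: the reset is instantaneous (no reset delay), the up output is modelled as active-high rather than as the inverted “up_bar” used on chip, and the function and event names are hypothetical.

```python
# Idealized behavioural model of a two-flip-flop PFD with an AND-gate reset.
def pfd_trace(events):
    """events: (time, source) pairs sorted by time, source is 'ref' or 'vco'.

    Returns a list of (time, up, down) states after each rising edge.
    """
    up = down = False
    trace = []
    for time, source in events:
        if source == "ref":
            up = True        # reference edge sets the 'up' flip-flop
        elif source == "vco":
            down = True      # feedback edge sets the 'down' flip-flop
        else:
            raise ValueError("source must be 'ref' or 'vco'")
        if up and down:
            # Invalid fourth state: the AND gate resets both flip-flops.
            up = down = False
        trace.append((time, up, down))
    return trace


if __name__ == "__main__":
    # Reference leads the divided VCO by 0.2 time units in every cycle.
    edges = [(0.0, "ref"), (0.2, "vco"), (1.0, "ref"), (1.2, "vco")]
    for state in pfd_trace(edges):
        print(state)
```

When the reference leads, the model produces an up pulse whose width equals the time difference between the two edges; when the feedback leads, the same happens on the down output.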

A true single-phase clocked (TSPC) D flip-flop [9] was used in this design because it can operate at higher speeds than pure CMOS logic. TSPC also has an advantage over typical precharged logic because, as its name implies, it requires only one clock phase as opposed to two. This contributes to simpler clock distribution and layout [12]. The positive-edge triggered D flip-flops [9] in Fig. 16 are modified to include a reset and feature a shorter path from input to output for shorter propagation delay and faster operation.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig16_HTML.gif
Fig. 16

Schematic of modified TSPC D flip-flop [9]

Charge Pump   The charge pump translates pulses from the PFD into single-ended current for input into the loop filter. The charge pump design chosen is of the switch-at-source type, which allows the charge pump to switch faster, as the switching transistors (connected directly to the PFD output) are connected only to the source of one transistor, resulting in less parasitic capacitance [15]. This design, the current-matching charge pump [19], is shown in Fig. 17. It utilizes negative feedback to adjust the up/down branch reference currents according to the output voltage, resulting in less current mismatch. It also does not require an error opamp, reducing the required die space. Transistors M11–M16 form a current mirror to supply the reference current, with an input bias voltage (Vref_CP). For sizing, longer transistors are used to reduce the channel-length modulation effect. Common centroid layout was used wherever possible to aid in matching.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig17_HTML.gif
Fig. 17

Schematic of charge pump [10]

Voltage-Controlled Oscillator   The voltage-controlled oscillator (VCO) translates voltage from the output of the loop filter into a corresponding frequency. As the intent of the PLL is to demonstrate the concept of frequency generation via software-based sigma–delta modulation, the encoding may require as much as 25% tuning range. In order to meet this need, a ring oscillator topology was chosen for the VCO. As a high operational frequency is desired, a somewhat more complicated delay cell than an inverter would have to be used. Here, a multiple-pass ring oscillator-based VCO [3] was chosen. The block diagram showing how the delay cells are connected is shown in Fig. 18, while the schematic of an individual delay cell can be found in Fig. 19. The delay cell is differential in nature, with a single-ended control voltage. It also has two inputs that are connected to the previous stage (p+, p−), and two inputs that are connected to the outputs that are two stages before the current one (s+, s−). This allows it to work much like precharged logic; the output node is already partially charged when the input from the previous stage goes high. Three delay cells are used, which is the minimum required for oscillation to occur. This allows for maximum oscillation frequency. RF transistors from the cmrf8sf library were used. The VCO is extremely sensitive to layout (most likely due to parasitics); three iterations had to be completed before proper operation was achieved.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig18_HTML.gif
Fig. 18

Multiple-pass VCO block diagram [3]

https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig19_HTML.gif
Fig. 19

Schematic of VCO delay cell [3]

Frequency Divider   The basic frequency divider topology used is based on a D flip-flop with its inverting output fed back to the input. This has a frequency division ratio of two. The D flip-flop is implemented in a master-slave fashion with two latches. One latch takes as input an unmodified clock, and the other an inverted clock. This allows the resulting flip-flop to be edge-sensitive [12]. A block diagram of a differential implementation of the frequency divider can be found in Fig. 20.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig20_HTML.gif
Fig. 20

Block diagram of frequency divider [14]

Initially, all frequency dividers were planned to be TSPC-based. However, simulations showed that this design was not fast enough for the target speed of the voltage-controlled oscillator (roughly 10 GHz). Therefore, a modified current-mode logic (CML) latch-based divider [22] was used to halve the frequency before the TSPC-based frequency dividers. This modified CML-based frequency divider, first introduced in [14], differs from traditional CML latches mainly in that it has an active load as opposed to a resistive load. It also removes the tail current sources to reduce transistor stacking, making it more suitable for the low supply voltages of current-day submicron technologies (1.2 V in the case of IBM 130 nm cmrf8sf). To achieve the designed division ratio of 64, one CML divider was used, followed by five TSPC dividers. The TSPC dividers were preferred because of their lower current draw and smaller die space requirement. The schematics of the CML divider and of the TSPC latch that the divider is based on can be found in Figs. 21 and 22, respectively.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig21_HTML.gif
Fig. 21

Schematic of CML frequency divider [22]

https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig22_HTML.gif
Fig. 22

Schematic of TSPC latch [12]
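The division ratio of the chain can be checked with a purely behavioural model: each stage is a toggle flip-flop (a D flip-flop with its inverted output fed back), and six of them in series, standing in for one CML stage and five TSPC stages, divide by 2^6 = 64. The sketch ignores all circuit-level differences between the CML and TSPC stages, and its function names are hypothetical.

```python
# Behavioural model of a chain of divide-by-two stages.
def divide_by_two(clock):
    """Toggle the output on every rising edge of a sampled 0/1 clock."""
    q = 0
    previous = 0
    output = []
    for level in clock:
        if previous == 0 and level == 1:
            q ^= 1           # D = not Q, so each rising edge toggles Q
        output.append(q)
        previous = level
    return output


def count_rising_edges(signal):
    """Number of 0->1 transitions, assuming the signal starts from 0."""
    previous = 0
    edges = 0
    for level in signal:
        if previous == 0 and level == 1:
            edges += 1
        previous = level
    return edges


def divider_chain(clock, stages):
    """Pass the clock through `stages` divide-by-two cells in series."""
    for _ in range(stages):
        clock = divide_by_two(clock)
    return clock


if __name__ == "__main__":
    vco = [0, 1] * 256                      # 256 input cycles
    divided = divider_chain(vco, stages=6)  # 1 CML + 5 TSPC stages
    print(count_rising_edges(vco), count_rising_edges(divided))
```

With 256 input cycles the chain output completes 4 cycles, consistent with the designed ratio of 64.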

Loop Filter   The loop filter is essential for the PLL to accurately represent the phase transfer function as designed with the top-down method. The transfer function of the loop filter is third order and is as follows.
$$ F(s) = \frac{2.405 \times 10^{11} s + 6.182 \times 10^{17}}{0.0153 \times s^3 + 1.771 \times 10^{5} s^2 + 1.62 \times 10^{12} s} $$
(12)

Initially, the loop filter was planned to be implemented on chip. A passive, LC-ladder based approach was first considered; however, restrictions on zero placement made it unsuitable for the transfer function of the filter. Gm-C filters were evaluated next as an option. For a filter with a bandwidth of around 1 MHz, the gm values required from each cell would be on the order of $1 \times 10^{-6}$ to $1 \times 10^{-7}$ S. This is difficult to achieve in a gm-cell, as the bias currents would have to be correspondingly small as well. For similar reasons, an active RC implementation would be less than ideal, due to the large RC constants (and capacitors) that would be required. Switched-capacitor filters were considered, but with time restrictions imposed by the tape-out deadline, it was decided to move the loop filter off-chip.

The 3rd-order off-chip filter (shown in Fig. 23) is implemented using active RC, as a cascade of a Tow-Thomas biquad and a first-order integrator section. The second section is required to implement a pole at DC with an additional zero (leaky integrator). The overall transfer function, $G(s) = G_1(s)G_2(s)$, consists of two parts,
$$ G_1(s) = \frac{1.332 \times 10^{14}}{s^2 + 1.133 \times 10^{7} s + 1.037 \times 10^{14}} $$
(13)
$$ G_2(s) = \frac{3.891 \times 10^{-7} s + 1}{3.367 \times 10^{-6} s} $$
(14)
whose product is equal to Eq. 12. Comparing each of these transfer functions to the symbolic transfer functions of the two filter sections, i.e.
$$ G_1(s) = \frac{-\frac{1}{R_2R_4C_1C_2}}{s^2 + \frac{1}{R_1C_1}s + \frac{1}{R_2R_3C_1C_2}} $$
(15)
$$ G_2(s) = -\frac{R_7C_3s + 1}{R_8C_3s} $$
(16)
the resistor and capacitor values can be derived. The reference resistor value is set at 10 kΩ to reduce the current needed to drive the filter, as the charge pump has limited current drive capability. As such, $R_1$, $R_3$, and $R_4$ were chosen to be this value, and $C_1$ was chosen to be 10 pF. $R_2$ and $C_2$ were then computed from the corresponding coefficients. Similarly, $C_3$ was chosen to be 120 pF for the last stage, and $R_7$ and $R_8$ were calculated. At the input of the filter, there is a resistor connected to analog ground (0.6 V) to allow the charge pump output to swing around the reference. At the output is a pair of Schottky diodes, which prevent damage to the VCO by limiting the range of the output voltage to 0–1.4 V. Higher swing is possible because the opamps used for the filter (AD8045) are powered from ±5 V.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig23_HTML.gif
Fig. 23

Loop filter schematic [4, 17]
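The coefficient matching for the integrator section can be followed numerically. The sketch below takes the numeric coefficients of Eqs. 13 and 14 and the stated 120 pF capacitor, and solves for the two resistors of the last stage by equating time constants with Eq. 16. It also reports the natural frequency and quality factor implied by the biquad denominator. The resulting resistor values are computed here for illustration and are not quoted in the paper; the variable and function names are hypothetical.

```python
import math

# Numeric coefficients copied from Eqs. 13 and 14.
G1_NUMERATOR = 1.332e14
G1_S_COEFF = 1.133e7          # coefficient of s in the biquad denominator
G1_CONSTANT = 1.037e14        # constant term of the biquad denominator
G2_ZERO_TAU = 3.891e-7        # equated with R7 * C3
G2_INTEGRATOR_TAU = 3.367e-6  # equated with R8 * C3
C3_FARADS = 120e-12           # value stated in the text


def biquad_parameters():
    """Return (natural frequency in Hz, quality factor, DC gain magnitude)."""
    w0 = math.sqrt(G1_CONSTANT)
    return w0 / (2 * math.pi), w0 / G1_S_COEFF, G1_NUMERATOR / G1_CONSTANT


def integrator_resistors(c3=C3_FARADS):
    """Solve the two time constants of the integrator section for R7 and R8."""
    return G2_ZERO_TAU / c3, G2_INTEGRATOR_TAU / c3


if __name__ == "__main__":
    f0, q, gain = biquad_parameters()
    r7, r8 = integrator_resistors()
    print(f"f0 = {f0 / 1e6:.2f} MHz, Q = {q:.2f}, |G1(0)| = {gain:.2f}")
    print(f"R7 = {r7:.0f} ohm, R8 = {r8:.0f} ohm")
```

In practice the computed values would be rounded to available standard resistor values, which shifts the pole and zero locations slightly.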

Input/Output Circuitry   The PLL contains input/output circuitry on-die to aid in interfacing to off-chip sources and equipment. The input has a level shifter [12] that accepts a 3.3 V signal and shifts it down to 1.2 V for the PFD input. The 3.3 V I/O (thick-oxide) transistors used at the input of the level shifter also have a larger maximum allowed $V_{DS}$ before breakdown than standard 1.2 V transistors, allowing a larger margin of error when connecting or setting up input sources. The level shifter also passes a 1.2 V input signal. The circuit diagram can be found in Fig. 24.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig24_HTML.gif
Fig. 24

Input buffer schematic [21]

The output drivers consist of a chain of differential CML buffers. The transistor sizes of the chain are tapered (from small to large) so that a relatively large off-chip load can be driven with a reasonable propagation delay. This concept is similar to sizing a chain of inverters for minimum delay [12]. A total of five stages were used. The driver was tested with a 50 Ω terminated, 20 pF load, which results in a swing of approximately 200 mV. Due to the large currents required to drive this load, the resistors used had to be carefully sized to ensure they would be able to handle the amount of current required. Polysilicon resistors were chosen because of their low variability over voltage and temperature. The schematic of a single CML buffer stage can be found in Fig. 25.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig25_HTML.gif
Fig. 25

CML buffer schematic [7]

4.3 PCB Considerations

In order to interface the PLL to off-board equipment and instrumentation, a printed-circuit board (PCB) must be designed and fabricated.

Chip-Bonded PCB   The PCB is intended to host the PLL die as well as the loop filter. It incorporates design features which aid in high-speed operation, such as microstrips, 50 Ω impedance SMA connectors, and Rogers 4003 material (for greater dielectric uniformity).

Input/Output   The input and output of the PCB are equipped with SMA connectors. The input is terminated with a 50 Ω resistor. The high-speed differential outputs, however, needed a transmission line to be correctly terminated at high frequencies. Striplines and microstrips were considered; microstrips were chosen because striplines would be very difficult to debug due to their embedded nature. Rogers 4003 was selected as the dielectric material, which has an ε r of 3.55. The dielectric thickness is 8 mils. Based on this information, the following equations from [11] were used to calculate the characteristic impedance Z o , i.e.,
$$ Z_o = \frac{120\pi}{\sqrt{\varepsilon_e}\left[W/d + 1.393 + 0.667 \ln(W/d + 1.444)\right]} $$
(17)
where
$$ \varepsilon_e = \frac{\varepsilon_r + 1}{2} + \frac{\varepsilon_r - 1}{2} \frac{1}{\sqrt{1+12d/W}} $$
(18)

The width of the line was found to be 17 mils, which gives a Z o of approximately 51.9 Ω. These microstrips were selected for impedance control for manufacturing, so a test coupon was created to measure their impedance. The test report states an impedance of 45.97 Ω on average, which is close to the desired value.
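As a quick check on Eqs. 17 and 18, the calculation can be sketched in a few lines of Python. The function name `microstrip_z0` and its argument names are illustrative only; the restriction to W/d ≥ 1 is the usual validity range for this form of the formula and is not stated in the text above. With the 17 mil width, 8 mil dielectric thickness, and ε r of 3.55 quoted here, the result should agree with the approximately 51.9 Ω reported.

```python
import math


def microstrip_z0(width, height, eps_r):
    """Characteristic impedance (ohms) of a microstrip per Eqs. 17 and 18.

    width and height may be in any common length unit (e.g. mils).
    Assumes width/height >= 1, the usual range for this closed form.
    """
    ratio = width / height
    if ratio < 1:
        raise ValueError("this form of the formula assumes W/d >= 1")
    # Eq. 18: effective dielectric constant
    eps_e = (eps_r + 1) / 2 + (eps_r - 1) / 2 / math.sqrt(1 + 12 / ratio)
    # Eq. 17: characteristic impedance
    bracket = ratio + 1.393 + 0.667 * math.log(ratio + 1.444)
    return 120 * math.pi / (math.sqrt(eps_e) * bracket)


# Values quoted in the text: W = 17 mils, d = 8 mils, Rogers 4003 (eps_r = 3.55)
z0 = microstrip_z0(width=17, height=8, eps_r=3.55)
print(f"Z0 = {z0:.1f} ohm")
```

Sweeping `width` in the same function is a simple way to see how sensitive Z o is to etching tolerances, which may help put the test-coupon measurement into context.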

Component Selection   Due to concerns about the charge pump drive capability, efforts were made to minimize the off-chip parasitic capacitance. To this end, components in the 0402 package size were used, as well as the LFCSP package for the AD8045 opamp. Snap-on terminal blocks are used for the power connections to allow quick connection to and disconnection from the power supply, so that changes can be made to the PCB quickly. Nylon standoffs were used on the corners of the PCB to reduce the chances of accidental shorts on the workbench.

Die Bonding   Normally, the die would be bonded to a package with pins using bondwires. Flip-chip ball grid array (BGA) type packaging would be ideal as it has no bondwires, but it is very expensive (more than the chip fabrication cost). Therefore, it was decided to bond the die to the PCB itself. This eliminates the additional parasitics of a package (excluding bondwires). This process is often referred to as COB (chip-on-board). The final board with the bonded die, marked with a white box, can be seen in Fig. 26.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig26_HTML.gif
Fig. 26

Front of PCB with PLL die marked using white square at centre of diagram

5 Experimental Results

Using the PCB with the bonded PLL die, the functionality and performance of the phase and frequency signal generation system are explored. Some of these results were partially presented in [20]; the test results are presented in their entirety here.

5.1 Test Setup

In order to implement the phase/frequency generation system, a register or some other kind of memory is required. A sigma–delta bitstream is stored in this memory and looped to create a continuous bitstream. On-chip, an existing IEEE 1149.1 bus can be utilized to receive data from off-chip and store the data into a register that is routed to the components under test. A phase-locked loop is then used as a reconstruction filter and also provides a means of scaling up the output frequency. Figure 27 shows how the system is represented by the test setup. The sigma–delta modulator and the DFC/DTC are implemented in software, and the resultant output is looped to create an uninterrupted, continuous bitstream. As this is identical to the simulation setup, the bits are saved from Matlab and loaded into a Hewlett Packard 81130A pattern generator. The pattern generator is programmed over GPIB via a Perl script, which configures the operating parameters (such as voltage levels and bit rate) and loads the bits into memory. Time domain measurements were captured with an Agilent 1169A active differential probe connected to an Agilent Infiniium DSA80000B oscilloscope, while frequency domain measurements were obtained with the output connected directly to an Agilent MXA N9020A spectrum analyzer. Agilent E3649A power supplies were used for all rails except analog ground, for which a Hewlett Packard 6633A was used. The 6633A is able to sink current, which is important as the output signal of the charge pump is designed to swing around analog ground.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig27_HTML.gif
Fig. 27

Phase/frequency generation system test setup
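The idea of storing a finite sigma–delta bitstream and looping it can be illustrated with a small sketch. It is not the modulator used in this work: a first-order modulator with 0/1 output levels is assumed purely for brevity, the DFC/DTC step is omitted, and the names `sigma_delta_dc` and `looped` are hypothetical. The point is only that when the pattern length is chosen so that the modulator state returns to its starting value, repeating the stored pattern is indistinguishable from running the modulator indefinitely.

```python
from fractions import Fraction


def sigma_delta_dc(dc_level, n_bits):
    """First-order sigma-delta encoding of a constant input in [0, 1].

    Assumed for illustration only; returns a list of 0/1 bits whose
    density of ones approximates dc_level.
    """
    acc = Fraction(0)
    bits = []
    for _ in range(n_bits):
        acc += dc_level          # integrate the input
        if acc >= 1:             # 1-bit quantiser with feedback
            bits.append(1)
            acc -= 1
        else:
            bits.append(0)
    return bits


def looped(pattern, n_bits):
    """Emulate a cyclic memory by reading the stored pattern repeatedly."""
    return [pattern[i % len(pattern)] for i in range(n_bits)]


# DC level 0.434 encoded exactly as a fraction; 1000 bits hold a whole period
dc = Fraction(434, 1000)
pattern = sigma_delta_dc(dc, 1000)
stream = looped(pattern, 3000)

print(sum(pattern) / len(pattern))           # density of ones in the pattern
print(stream == sigma_delta_dc(dc, 3000))    # looping matches a continuous run
```

Exact fractions are used so that the accumulator returns to zero after 1000 bits; with floating-point arithmetic the wrap-around point could drift slightly.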

The PLL die micrograph is shown in Fig. 28. The die is 1 mm × 1 mm, and the core PLL area (including output driver) is about 158 μm × 195 μm.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig28_HTML.gif
Fig. 28

PLL die micrograph

5.2 Clock Input

Prior to testing frequency and phase signal generation, some basic PLL functionality has to be verified. For this purpose, the aforementioned pattern generator and an Agilent 33250A function generator were used to drive the PLL input with a clock. The function generator was used to sweep the input to check the lock and capture range of the PLL. It was found that with the VCO powered from 1.2 V, the capture range is around 59 MHz to 69 MHz (3.776–4.416 GHz output). With the VCO powered from 1.3 V, the capture range is 72.5 MHz to 80 MHz (4.640–5.120 GHz output). The lock range was found to be similar. This operational region is stable across the samples tested (three dies/boards). Beyond this region, increasing the input frequency results in a VCO output that is distorted and non-sinusoidal. However, it can be observed on the spectrum analyzer that the PLL is still tracking the input frequency, as the output frequency changes with it. Increasing the PLL input frequency further usually leads to a narrow band of operation that locks with a sinusoidal output; however, the location of this band varies with the die tested (roughly between 5 and 6 GHz). Further tests are performed in the 59–69 MHz input range, as this is the widest region of operation and is consistent across different dies.

The phase noise is measured with a clock for different frequencies. The spectrum is captured using the spectrum analyzer, averaged, and the phase noise calculated using
$$ L\{\Delta\omega\} = 10\log_{10}\left(\frac{P_{\rm noise} @ \Delta\omega}{\mathrm{ResBW}}\right) - 10\log_{10}\left(P_{\rm carrier}\right). $$
(19)
The results are between −70 and −60 dBc/Hz. A typical capture of the spectrum is shown in Fig. 29 for a 67 MHz input clock. The screen shows a carrier power of −18.822 dBm, a noise power of −49.355 dBm at 1 MHz offset, and a resolution bandwidth of 10 kHz, giving an output phase noise of about −70 dBc/Hz at 1 MHz offset. A time domain plot of the PLL output with a 65 MHz clock input is shown in Fig. 30. It can be seen that the output is approximately sinusoidal. The plot also shows the cycle-to-cycle jitter measured with a histogram one edge away from the trigger; the standard deviation as measured is 1.02 ps. The results from the clock input measurement tests are summarized in Table 1.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig29_HTML.gif
Fig. 29

PSD of PLL output with 67 MHz clock input

https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig30_HTML.gif
Fig. 30

Cycle-to-cycle jitter of PLL output for 65 MHz clock input

Table 1

Clock measurement results

| Parameter | Result |
| --- | --- |
| Lock/Capture range @ 1.2 V | 59 MHz–69 MHz |
| Lock/Capture range @ 1.3 V | 72.5 MHz–80 MHz |
| Phase noise, 67 MHz input | −70 dBc/Hz @ 1 MHz |
| Cycle-to-cycle jitter, 67 MHz input | 1.02 ps |
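The phase-noise figure quoted for the 67 MHz clock can be reproduced from the three spectrum analyzer readings given for Fig. 29. The sketch below works directly in decibels: the noise power is normalised to a 1 Hz bandwidth and then referred to the carrier. The function name is illustrative, and no analyzer-specific correction factors are applied, which is an assumption on our part.

```python
import math


def phase_noise_dbc_hz(carrier_dbm, noise_dbm, res_bw_hz):
    """Single-sideband phase noise estimate in dBc/Hz.

    noise_dbm is the power read at the offset of interest within the
    resolution bandwidth res_bw_hz; no instrument corrections are applied.
    """
    noise_dbm_per_hz = noise_dbm - 10 * math.log10(res_bw_hz)
    return noise_dbm_per_hz - carrier_dbm


# Readings quoted for Fig. 29 (67 MHz input clock, 1 MHz offset)
pn = phase_noise_dbc_hz(carrier_dbm=-18.822, noise_dbm=-49.355, res_bw_hz=10e3)
print(f"{pn:.1f} dBc/Hz at 1 MHz offset")
```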

5.3 Frequency Signal Generation

The PLL was used to synthesize signals of varying frequencies. The frequencies generated are listed in Table 2. Figure 31 shows the output spectrum of the PLL. The frequency is identical to that found in simulation (Fig. 12) and predicted by Eq. 9. The measured phase noise is −42 dBc/Hz at 20 MHz offset. Figure 32 shows a time domain capture of the sinusoidal PLL output. It also shows the cycle-to-cycle jitter, measured at the next rising edge after the triggered edge, which is approximately 1.48 ps. Figure 33 shows the output voltage (peak-to-peak) versus frequency over a 500 MHz frequency range.
Table 2

Frequencies generated

| DC level | DFC carrier frequency (MHz) | PLL output frequency (GHz) |
| --- | --- | --- |
| 0.434 | 64.530 | 4.1299 |
| 0.452 | 65.340 | 4.1818 |
| 0.461 | 65.745 | 4.2077 |
| 0.470 | 66.150 | 4.2336 |
| 0.491 | 67.095 | 4.2941 |
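The entries of Table 2 can be cross-checked with a short script. It assumes the PLL multiplication factor of 64 that is stated for the 65 MHz example in Sect. 5.4 and implied by the clock-input ranges in Sect. 5.2. The linear relation in `fitted_carrier_mhz` is simply an empirical fit to the tabulated carriers; whether it coincides with the form of Eq. 9 is not verified here.

```python
PLL_MULTIPLIER = 64  # assumed from the 65 MHz -> 4.16 GHz example in the text

# (DC level, DFC carrier in MHz, PLL output in GHz), copied from Table 2
TABLE_2 = [
    (0.434, 64.530, 4.1299),
    (0.452, 65.340, 4.1818),
    (0.461, 65.745, 4.2077),
    (0.470, 66.150, 4.2336),
    (0.491, 67.095, 4.2941),
]


def pll_output_ghz(carrier_mhz, multiplier=PLL_MULTIPLIER):
    """PLL output frequency in GHz for a given input carrier in MHz."""
    return carrier_mhz * multiplier / 1000.0


def fitted_carrier_mhz(dc_level):
    """Empirical fit to the tabulated carriers (not taken from Eq. 9)."""
    return 45.0 * (1.0 + dc_level)


for dc, carrier, out in TABLE_2:
    print(f"DC {dc:.3f}: fit {fitted_carrier_mhz(dc):.3f} MHz, "
          f"x64 -> {pll_output_ghz(carrier):.4f} GHz (table: {out:.4f} GHz)")
```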

https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig31_HTML.gif
Fig. 31

Measured spectrum of PLL output for 0.434V DC input

https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig32_HTML.gif
Fig. 32

Time domain capture of PLL output for 0.434V DC input

https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig33_HTML.gif
Fig. 33

Output voltage vs. freq. over 500 MHz range centred at 4.2 GHz

A potential application of frequency generation is testing for frequency response. Two off-chip microstrips (the same as the ones used for the differential output of the bonded-die PLL) on a solder-sample PCB were connected in series. One end served as the input, the other as the output. These microstrips were characterized with the frequency generator and the results compared with those obtained using an off-chip source (a Centellax TG1C1-A clock synthesizer). The results correlate reasonably well. A comparison can be seen in Fig. 34.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig34_HTML.gif
Fig. 34

Comparison of microstrip characterization with frequency signal generator and external synthesizer

5.4 Phase Signal Generation

The basic idea of phase signal generation is the ability to move an edge of a signal with respect to a reference signal. To demonstrate this, the pattern generator is loaded with various phase-encoded sigma–delta bitstreams with different DC codes as input to the PLL. Figure 35 shows one example of the time domain PLL output. The signal is displayed with colour grading, showing how much the signal is moving. The reference clock is taken from another output of the pattern generator. The delay in Fig. 35 is measured as 256.53 ps. Table 3 shows the delay for each input DC code. The adjusted offset accounts for the case where the delay is greater than the period of the signal, in which case the period is subtracted from the measured delay value. When the delay grows greater than the period, the measured offset will decrease. This can be seen when the DC code is changed from 0.443 to 0.452. As the output frequency is 4.16 GHz (65 MHz input multiplied by 64), the period is 240.4 ps. The delay can be observed to increase with greater DC values. The measured and simulated delay versus the output DC code is shown in Fig. 36. The DC codes in Table 3 were used for both system measurements and Simulink. As can be observed from the line with ‘+’ markers, the measured results are very close to the Simulink results. The errors can be attributed to the uncertainty in the measurements due to jitter, as well as additional delays introduced by cables and other measurement equipment. The adjusted measurement results are shown with a dot-dash (.−) line and closely track the simulated results.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig35_HTML.gif
Fig. 35

Time domain PLL output with respect to reference clock, 0.434V DC input

Table 3

Phase output results

| DC code | Measured offset (ps) | Adjusted offset (ps) |
| --- | --- | --- |
| 0.4302 | 234.30 | 234.30 |
| 0.434 | 236.53 | 236.53 |
| 0.443 | 281.00 | 40.6 |
| 0.452 | 69.74 | 69.74 |
| 0.461 | 121.99 | 121.99 |
| 0.470 | 155.34 | 155.34 |
| 0.491 | 228.73 | 228.73 |
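The adjustment applied in Table 3 amounts to subtracting one output period whenever the measured delay exceeds it. A minimal sketch, using the 65 MHz input and multiplication factor of 64 given in the text (the function and variable names are illustrative):

```python
# Output period in ps for a 65 MHz input multiplied by 64 (about 240.4 ps)
PERIOD_PS = 1e12 / (65e6 * 64)


def adjusted_offset(measured_ps, period_ps=PERIOD_PS):
    """Subtract one period when the measured delay exceeds the period."""
    if measured_ps > period_ps:
        return measured_ps - period_ps
    return measured_ps


# Measured offsets (ps) copied from Table 3, keyed by DC code
measured = {0.4302: 234.30, 0.434: 236.53, 0.443: 281.00,
            0.452: 69.74, 0.461: 121.99, 0.470: 155.34, 0.491: 228.73}

for code, delay in measured.items():
    print(f"DC code {code}: measured {delay:.2f} ps, "
          f"adjusted {adjusted_offset(delay):.2f} ps")
```

Only the 0.443 entry exceeds the period, so it is the only row whose adjusted value differs from the measured one.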

https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig36_HTML.gif
Fig. 36

Measured and simulated output phase vs. DC code

To demonstrate a potential application of phase signal generation, the jitter transfer function of the PLL was characterized using sinusoidal phase-modulated signals [1]. The amplitude of the output phase with respect to the carrier was compared to the amplitude of the input signal applied to the sigma–delta modulator. A total of 22 points were taken. The carrier of the DTC output is at the 65 MHz input, giving an output PLL carrier frequency of 4.16 GHz. The pattern generator sampling frequency is 260 MHz. The measured phase noise is −55 dBc/Hz at 1 MHz offset for a DC input code of 0.434. A comparison between the ideal transfer function (solid line) and the measured PLL transfer function (dashed line) can be found in Fig. 37. The correlation is reasonable.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig37_HTML.gif
Fig. 37

Measured PLL jitter transfer function vs. ideal at 4.16 GHz

6 Future Research

For future research, a 3rd order PLL with an on-chip, 2nd order integrated RC filter will be investigated with regard to its performance and area. This would aid in an analysis of the trade-offs between the advantages of an integrated, lower-order filter and the cost of a lower-order (and therefore lower SNR) sigma–delta modulator. For this implementation, the VCO gain is equal to 1 GHz/V and the charge pump current is 125 μA. With respect to the loop filter, shown in Fig. 38, R 1 is equal to 3 kΩ, C 1 to 250 pF, and C 2 to 50 pF. Note that such component values are easily realizable on-chip. The overall jitter transfer function of the 3rd order PLL has a 1.2 MHz bandwidth and is shown in Fig. 39.
https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig38_HTML.gif
Fig. 38

Schematic of 2nd order RC loop filter [16]

https://static-content.springer.com/image/art%3A10.1007%2Fs10836-012-5289-0/MediaObjects/10836_2012_5289_Fig39_HTML.gif
Fig. 39

Overall jitter transfer function of proposed 3rd order PLL
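The quoted component values can be plugged into a linear phase-domain model to see how the bandwidth arises. The sketch below rests on two assumptions that are not stated in this section: the loop filter of Fig. 38 is taken to be the common arrangement of a series R 1–C 1 branch shunted by C 2, and the feedback division ratio is taken as 64, as in the fabricated prototype. Under these assumptions the computed −3 dB bandwidth should land close to the 1.2 MHz quoted above.

```python
import math

# Values quoted in the text
K_VCO = 1e9        # VCO gain in Hz/V (1 GHz/V)
I_CP = 125e-6      # charge pump current in A
R1 = 3e3           # ohms
C1 = 250e-12       # farads
C2 = 50e-12        # farads

# Assumption, not stated in this section
N_DIV = 64         # feedback division ratio, borrowed from the x64 prototype


def filter_impedance(s):
    """Assumed topology: C2 in parallel with the series branch R1 + 1/(s*C1)."""
    series_branch = R1 + 1.0 / (s * C1)
    shunt_cap = 1.0 / (s * C2)
    return series_branch * shunt_cap / (series_branch + shunt_cap)


def jitter_transfer(f_hz):
    """Normalised closed-loop phase transfer G/(1+G) at offset frequency f_hz."""
    s = 1j * 2.0 * math.pi * f_hz
    # Open loop: (I_CP / 2pi) * Z(s) * (2pi * K_VCO / s) / N_DIV
    g = I_CP * filter_impedance(s) * K_VCO / (s * N_DIV)
    return g / (1.0 + g)


def bandwidth_3db(f_lo=1e4, f_hi=1e8, iterations=60):
    """Bisect on a log axis for the frequency where |H| drops to 1/sqrt(2)."""
    target = 1.0 / math.sqrt(2.0)
    for _ in range(iterations):
        f_mid = math.sqrt(f_lo * f_hi)
        if abs(jitter_transfer(f_mid)) > target:
            f_lo = f_mid
        else:
            f_hi = f_mid
    return math.sqrt(f_lo * f_hi)


bw = bandwidth_3db()
print(f"-3 dB bandwidth of the modelled loop: {bw / 1e6:.2f} MHz")
```

The same functions can be evaluated over a range of offset frequencies to sketch a curve comparable in shape to Fig. 39, although the model ignores sampling effects in the phase detector and any parasitics.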

7 Conclusion

A phase/frequency signal generator for high-frequency applications, amenable to digital testing methodologies without additional test pins, was presented. The generator was implemented by means of a cyclic memory element and a custom integrated PLL. Although its phase noise performance is considerably worse than that of a fractional-N synthesizer, it is nevertheless useful for various debug and diagnosis situations that a design or test engineer may encounter. In future research, a complete on-chip implementation with an integrated loop filter will be investigated.

Copyright information

© The Author(s) 2012