1 Introduction

Digital PLL architectures are gaining importance in the frequency synthesizer field thanks to their versatility and scalability. Figure 3.1 shows a simplified scheme of a digital-intensive fractional-N frequency synthesizer, also called a digital phase-locked loop (DPLL). The divider in the feedback path reduces the frequency of the DCO output signal \(f_v\), generating a lower-frequency signal \(f_d\). The phase detector (PD) compares the phase of \(f_d\) with the phase of the reference signal \(f_r\) and generates the error signal e, which, through the digital filter, controls the DCO frequency tuning word tw.

The frequency control word (FCW) of the fractional divider is quantized by a sigma-delta modulator before being provided to the multi-modulus divider. To avoid spurs in the output spectrum, the quantization noise introduced in the loop by this operation is cancelled out by a digital-to-time converter (DTC), once the DTC transfer function gain is properly matched by digital correction algorithms [1]. Digital algorithms, in fact, make it possible to correct analog mismatches, non-linear transfer functions and PVT variations with a small overhead in terms of area and power consumption with respect to fully analog systems. Without the quantization noise, and considering a Type-II loop, the phase error between \(f_r\) and \(f_d\) is dominated by random noise. This property allows the use of a phase detector with a small dynamic range, such as the 1-bit phase detector, which reduces power consumption and area with respect to the multi-bit phase detector approach or the standard charge-pump-based PLL. The combination of a DTC, a 1-bit phase detector, called Bang-Bang Phase Detector (BB-PD), and digital corrections thus enables low-power and low-spur fractional-N frequency synthesis [1]. One of the main challenges in the design of a low-power sub-6GHz Bang-Bang Digital PLL is to reduce the locking time while meeting the stringent phase noise mask. The locking time is lengthened by the limited dynamic range of the Bang-Bang phase detector and depends mainly on the loop filter parameters.

Fig. 3.1 Generic architecture of a digital phase-locked loop (DPLL). Synthesized standard-cell-based digital blocks are depicted in grey

2 Digital PLL: Output Phase Noise and Locking Transient

In a digital phase-locked loop such as the one in Fig. 3.1, the digital filter and algorithms are clocked by the reference signal \(f_r\), and the system should be analyzed as a discrete-time quantized system. A fair representation of this system can be obtained with a multi-rate discrete-time model. However, by neglecting the folding effect and converting the multi-rate model to the DCO sampling rate, it is possible to derive the simplified model depicted in Fig. 3.2, which can be used to compute the noise transfer functions from the reference phase noise \(\Phi _r\) and the DCO phase noise \(\Phi _v\) to the output phase noise \(\Phi _{out}\) [2].

Fig. 3.2 Equivalent model of a digital phase-locked loop (DPLL) at the DCO sampling rate

In this model, \(T_R\) is the reference period, \(K_{BPD}\) is the gain of the phase detector, \(\alpha \) and \(\beta \) are the loop filter parameters, \(K_T\) is the DCO period gain and N is the division factor. For a small PLL bandwidth, the equivalent continuous-time loop gain \(G_{loop}(s)\) of the model is

$$\begin{aligned} G_{loop}(s) \approx K_{BPD} \left( \beta + \frac{\alpha }{s T_{R}} \right) \frac{K_T N}{s T_{R}}\,. \end{aligned}$$
(3.1)

The main assumption in a Bang-Bang PLL design procedure is that, in the steady-state condition, the 1-bit phase detector works in the random-noise regime. In this condition the phase error \(\Delta t\) is close to zero and the Gaussian-distributed random noise toggles the BB output between \(+\)1 and −1. Under this assumption, and with \(\sigma _{\Delta t} \gg N \beta K_T\), the \(K_{BPD}\) value inside \(G_{loop}(s)\) is related to the standard deviation \(\sigma _{\Delta t}\) of the \(\Delta t\) signal [2] by the relation:

$$\begin{aligned} K_{BPD} \approx \sqrt{\frac{2}{\pi }}\frac{1}{\sigma _{\Delta t}} \quad . \end{aligned}$$
(3.2)
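As a quick numerical sanity check of (3.2), the following Python sketch linearizes the average bang-bang output around \(\Delta t = 0\). It assumes zero-mean Gaussian jitter, for which the mean detector output is \(\mathrm{erf}(\Delta t/(\sqrt{2}\,\sigma _{\Delta t}))\); the function names and the 1 ps jitter value are illustrative assumptions, not values taken from the design.

```python
import math

def mean_bb_output(dt, sigma):
    """Average of sign(dt + n) for zero-mean Gaussian noise n with std sigma."""
    return math.erf(dt / (math.sqrt(2.0) * sigma))

def k_bpd_linearized(sigma):
    """Small-signal detector gain predicted by (3.2), in 1/s."""
    return math.sqrt(2.0 / math.pi) / sigma

sigma_dt = 1e-12       # hypothetical rms time error at the detector input [s]
h = 1e-3 * sigma_dt    # small probing offset around dt = 0

# Central-difference slope of the averaged detector characteristic at dt = 0
k_numeric = (mean_bb_output(h, sigma_dt) - mean_bb_output(-h, sigma_dt)) / (2 * h)
k_formula = k_bpd_linearized(sigma_dt)
print(f"numeric slope: {k_numeric:.4e} 1/s, formula (3.2): {k_formula:.4e} 1/s")
```

The two values agree closely as long as the probing offset is much smaller than \(\sigma _{\Delta t}\), which is the small-signal condition under which the linearized gain is meaningful.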

Due to this dependence of \(G_{loop}(s)\) on the variance of \(\Delta t\), the mathematical optimization of the output jitter is more complex than in a fully analog PLL. In fact, the jitter \(\sigma _{\Delta t}^2\) of the \(\Delta t\) signal is composed of the jitter from the reference signal \(\sigma _{t_{r}}^2\) and the jitter from the feedback path \(\sigma _{t_{v}}^2\).

Fig. 3.3 Generic DCO sampling rate model of a digital phase-locked loop (DPLL) for transient analysis

By changing the bandwidth, the output jitter also changes, and so does the noise level at the phase detector input. This variation affects \(G_{loop}(s)\) and therefore the PLL bandwidth. Moreover, \(\sigma _{\Delta t}^2\) is the sum of the random-noise jitter \(\sigma _{\Delta t,rn}^2\) and the jitter related to the limit cycle \(\sigma _{\Delta t,lc}^2\), that is, the periodic behavior of the state variable \(\Delta t\) induced by the loop quantization. The jitter due to the limit cycle is relevant only in the case of low phase noise. Considering no latency in the loop, the \(N \beta K_T\) quantization sets the minimum achievable input jitter

$$\begin{aligned} \sigma _{\Delta t,lc} \simeq \frac{N\beta K_T}{\sqrt{3}}\,. \end{aligned}$$
(3.3)
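Relation (3.3) is easy to evaluate numerically. The short Python sketch below computes the limit-cycle jitter and compares it with a random-noise jitter level; all numerical values (division factor, \(\beta \), \(K_T\), random-noise jitter) are hypothetical and chosen only for illustration.

```python
import math

def limit_cycle_jitter(n_div, beta, k_t):
    """Limit-cycle jitter of (3.3) for a loop without latency [s]."""
    return n_div * beta * k_t / math.sqrt(3.0)

# Hypothetical values, for illustration only
n_div = 73          # division factor N
beta = 1.0          # proportional coefficient of the loop filter
k_t = 0.73e-15      # DCO period gain K_T [s/LSB]
sigma_rn = 200e-15  # random-noise jitter at the detector input [s]

sigma_lc = limit_cycle_jitter(n_div, beta, k_t)
# The limit cycle is negligible when it stays below the random-noise jitter
in_random_noise_regime = sigma_lc <= sigma_rn
print(f"sigma_lc = {sigma_lc * 1e15:.1f} fs, random-noise regime: {in_random_noise_regime}")
```

With these assumed numbers the limit-cycle contribution is well below the random noise, so the detector would operate in the random-noise regime.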

As explained in [3], the optimum output jitter is found when \(\sigma _{\Delta t,lc} \lesssim \sigma _{\Delta t,rn}\). Choosing \(\beta K_T N\) to keep the system in the random-noise regime under low-noise requirements also affects the locking time. To properly understand the transient behavior, we have to further simplify the model in Fig. 3.2, taking into account also the non-linear characteristic of the phase detector. In fact, during the locking transient, \(\Delta t\) saturates the phase detector and the random-noise condition is no longer valid. The phase detector can be represented with a sign function, and the locking time can be evaluated by considering the transient of the DCO period \(T_{DCO}[k]\), which is composed of the free-running period \(T_0\) and the tuning component \(tw[k]\ K_T\). The equivalent model is depicted in Fig. 3.3. When the divider modulus changes from N to N+1, the additional cycle accumulated in the feedback counter increases the \(\Delta t\) error, saturating the phase detector. The constant output error is integrated by the loop filter, and the frequency of the oscillator is changed to reduce the time error. Each time the sign of the error e changes, the transient enters a new locking segment and the absolute value of \(\Delta t\) is reduced. When the absolute value of the time error is comparable with the time error in the steady-state condition, the system reaches the locked state. The overall locking transient, composed of the locking segments, is depicted in Fig. 3.4a.

Fig. 3.4 Locking transient: (a) behavior of the main loop state variables, (b) long \(\Delta t\) locking transient for a noise-optimized PLL

Defining the deviation of the DCO period from the final steady state DCO period \(T_{DCO}[\infty ]\) at each discrete step k as \(\Delta T_0[k]\) it is possible to demonstrate that the locking time is proportional to:

$$\begin{aligned} T_{locking} \propto T_{R} \frac{1}{R(2-R)} \left( \frac{\Delta T_0}{\beta K_T} \right) ^2 \end{aligned}$$
(3.4)

where R is the ratio between \(\alpha \) and \(\beta \). Comparing the random-noise condition (\(\sigma _{\Delta t,lc} \lesssim \sigma _{\Delta t,rn}\)) with the locking time (3.4), we can easily see that reducing \(\beta K_T\) improves the minimum jitter achievable by the system but heavily lengthens the locking time. With the PLL parameters chosen to optimize the noise, the locking transient takes an unacceptable amount of time. For example, the locking time estimated with (3.4) for a frequency step of 100 kHz is around 9.3 ms, as in Fig. 3.4b.
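The quadratic dependence in (3.4) can be illustrated with a behavioral simulation of a sign-function loop in the spirit of Fig. 3.3. The Python sketch below is a simplified, noiseless model with no loop latency; times are normalized to \(N \beta K_T\), and the ratio R = 0.01, the lock threshold and the function names are assumptions made for this example only.

```python
def sign(x):
    """Bang-bang detector: +1 for x >= 0, -1 otherwise."""
    return 1.0 if x >= 0 else -1.0

def locking_cycles(d_norm, r, n_steps, threshold=5.0):
    """Reference cycles needed before |dt| stays below the threshold.

    All times are normalized to N*beta*K_T. d_norm is the initial period
    error accumulated per reference cycle, r is the ratio alpha/beta.
    """
    dt = 0.0            # time error at the detector input
    psi = 0.0           # integrator state of the PI filter
    last_outside = -1   # last cycle in which |dt| exceeded the threshold
    for k in range(n_steps):
        if abs(dt) > threshold:
            last_outside = k
        e = sign(dt)
        psi += e
        # Proportional plus integral correction applied to the next cycle
        dt += d_norm - (e + r * psi)
    return last_outside + 1

r = 0.01  # hypothetical alpha/beta ratio
t_small = locking_cycles(10.0, r, n_steps=20_000)
t_large = locking_cycles(20.0, r, n_steps=60_000)
print(t_small, t_large, t_large / t_small)
```

In this model, doubling the initial period error should roughly quadruple the number of reference cycles needed to lock, consistent with the squared term in (3.4).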

One way to overcome this trade-off is to change the architecture by having two separate loops: one designed for the steady-state random noise condition and the other for speeding-up the locking transient.

3 Multi-loop Architecture for Fast Locking Transient

A frequency synthesizer used as a local oscillator usually has to cover a wide tuning range to properly downconvert different standards or channels to baseband. For example, a 3.7 GHz DCO with a \(10\%\) tuning range should cover a frequency range from about 3.5 to 3.9 GHz. Covering this range with a single analog-controlled varactor while keeping a fine frequency resolution (e.g. 10 kHz/LSB) is not feasible. In fact, building the entire oscillator on a single bank of minimum-size digitally switched capacitive cells leads to thousands of control wires and connections, spoiling the DCO performance in terms of area and noise. The commonly used approach is to design a segmented DCO, in which the entire tuning range is split into several overlapping tuning segments.

The fine thermometric capacitive bank of the DCO covers the tuning characteristic of one segment, while a coarser thermometric capacitive bank shifts this tuning segment up or down to create the overall DCO tuning characteristic. The number of elements in each segment and the relative gain \(K_{T}\) are limited by performance considerations and by the need to avoid blind frequency regions in the DCO tuning characteristic. A common rule of thumb is to size the LSB of each bank equal to half of the dynamic range of the immediately smaller bank. This ensures an overlap between two adjacent segments of more than 50% and binds the maximum value of \(K_{T1}\) to the \(K_{T}\) value. The segmentation, properly controlled by a digital counter, implements different \(K_T\) gains that can be exploited to speed up the locking transient.
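The sizing rule can be made concrete with a small Python sketch for a two-bank DCO. The 400 MHz range and the 10 kHz/LSB resolution echo the example above, while the 512-element fine bank and the function name are hypothetical choices for illustration.

```python
import math

def segmented_dco(tuning_range_hz, fine_lsb_hz, fine_elements):
    """Size a two-bank segmented DCO with the half-range rule of thumb."""
    fine_range = fine_lsb_hz * fine_elements   # span of one tuning segment
    coarse_lsb = fine_range / 2.0              # LSB = half the finer bank's range
    overlap = 1.0 - coarse_lsb / fine_range    # shared fraction of adjacent segments
    coarse_elements = math.ceil((tuning_range_hz - fine_range) / coarse_lsb)
    return coarse_lsb, overlap, coarse_elements

# Hypothetical numbers: 400 MHz range, 10 kHz/LSB, 512-element fine bank
coarse_lsb, overlap, coarse_elements = segmented_dco(400e6, 10e3, 512)
flat_elements = round(400e6 / 10e3)  # a single unsegmented bank, for comparison

print(f"coarse LSB: {coarse_lsb / 1e6:.2f} MHz, overlap: {overlap:.0%}")
print(f"elements: {512 + coarse_elements} segmented vs {flat_elements} in a single bank")
```

In this idealized model the overlap evaluates to exactly 50% at the limit of the rule; choosing a slightly smaller coarse LSB increases it. The comparison with the single-bank count shows why segmentation avoids thousands of control wires.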

From (3.4) we know that the greater the initial DCO period error, the longer the locking transient. Looking at the equation that describes the phase error trajectory \(\Delta t[k]=t_r[k] - t_d[k] \)

$$\begin{aligned} \Delta t [k] = k(T_{ref}-T_0) \pm \beta K_TN- \sum ^{k-1}_{i=0}(\alpha (\Psi _0 \pm i) \pm \beta ) K_TN + T_0 N \end{aligned}$$

we also know that the maximum time error in a locking transient is non-linear with respect to the initial DCO period error.

To detect a long locking transient, we can insert an additional phase detector that indicates when the phase error is above a defined threshold. The idea is to use a phase detector with a dead zone. In steady state the time error is inside the dead zone and the output is zero. Outside the dead zone the phase detector behaves as a BB-PD and controls, through a digital PI filter, a coarser DCO bank. Moreover, this additional path has a larger value of \(\beta _1 K_{T1}\) than the main loop, thus allowing a fast transient without compromising the output phase noise. This path will be denoted as the frequency-aid branch. The resulting architecture is reported in Fig. 3.5.
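The difference between the two detectors can be sketched in a few lines of Python. This is a behavioral illustration only: the 5 ps dead-zone half-width and the function names are assumptions, not parameters of the implemented circuit.

```python
def bang_bang_pd(dt):
    """Main-loop 1-bit detector: sign of the time error."""
    return 1 if dt >= 0 else -1

def dead_zone_pd(dt, dead_zone):
    """Frequency-aid detector: 0 inside the dead zone, sign outside."""
    if abs(dt) <= dead_zone:
        return 0
    return 1 if dt > 0 else -1

dead_zone = 5e-12  # hypothetical half-width of the dead zone [s]
for dt in (-20e-12, -1e-12, 0.5e-12, 12e-12):
    main = bang_bang_pd(dt)
    aid = dead_zone_pd(dt, dead_zone)
    print(f"dt = {dt:+.1e} s -> main loop: {main:+d}, frequency aid: {aid:+d}")
```

For small time errors only the main loop reacts, so the frequency-aid branch adds no quantization activity in steady state; for large errors both detectors push in the same direction.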

Fig. 3.5 Nested loop PLL architecture: the frequency-aid branch is composed of a TPD with deadzone and a PI filter

Fig. 3.6 Proposed PLL architecture: nested loop with feed-forward path and deadzone

From a first analysis of the model, we may conclude that the \(\beta _1 K_{T1}\) value has no upper limit, since this branch is disabled by the dead zone in the steady-state condition. In a practical implementation, however, as we have seen in the DCO segmentation design, \(K_T\) and \(K_{T1}\) are bound together, and this limits the effectiveness of the scheme. Moreover, the large quantization on the frequency-aid path, due to the discrete number of capacitors in the DCO bank, causes a large deviation from the standard trajectory and may lead to instability. The quantization and the finite dynamic range are taken into account in the model of Fig. 3.6 by the Q blocks and the limiter blocks. To reduce the influence of these two effects on the locking behavior of the loop, and to set the gain of the frequency-aid path without impacting the DCO performance, it is possible to use an alternative control path that is still quantized but has less effect on the locking transient.

For example, in a fractional-N PLL the divider is driven by a sigma-delta modulator and the quantization error is cancelled out by a digital-to-time converter (DTC) to keep the BB-PD in the random-noise regime. The divider control word can handle integer and fractional divider values N with a small residual quantization in the loop. By controlling the fractional part of the divider FCW we can add or subtract a fraction of the \(T_{DCO}\) period to or from the signal \(t_{d}\); the divider thus acts like an intrinsic integrator in the DCO path. The modified scheme is depicted in Fig. 3.6. Thanks to this feed-forward path, when the time error \(\Delta t\) is larger than the deadzone of the frequency-aid phase detector, the time error is immediately reduced by \(\lambda _1 T_{DCO}\) while the DCO frequency is adjusted by the coarser capacitive bank. If \(\lambda _1\) is properly sized to keep the phase jump inside half of the deadzone, at each locking-aid activation the trajectory restarts with \(\Delta t \) around zero and the DCO frequency is decreased or increased at each reference cycle.

Fig. 3.7 Die microphotograph of the implemented sub-6GHz PLL

4 Measurement Results

The PLL described in this chapter has been fully integrated in a 65 nm CMOS process (see the die photo in Fig. 3.7) and occupies a core area of \(0.61\,\text {mm}^2\). The measurement results were presented in [4].

The implemented frequency synthesizer for the sub-6GHz range generates a sinusoidal output signal from 3.59 to 4.05 GHz. The reference signal is generated by an integrated reference oscillator working with an external 52 MHz quartz reference. From the die microphotograph it is possible to identify the Class-B double-tail-resonator DCO as the main contributor to the active area. The technology metal stack does not provide thick or ultra-thick metal layers, and the high LC quality factor needed to satisfy the output phase noise mask is obtained by using a large-width main inductor. The measured output frequency range of 3.59 to 4.05 GHz is equivalent to a tuning range of 12%. The flicker corner frequency is 60 kHz.

The analog blocks are placed in the space between the DCO and the crystal oscillator (XO in Fig. 3.7). These blocks are the CMOS programmable multi modulus divider (MMD), the digital to time converter (DTC) and the Bang-Bang Phase detector (a simple Flip-Flop) and they are implemented in a similar way to [1]. The 5-level Coarse TDC is instead realized with a cascade of delay cells and Flip-Flops.

The bang-bang Phase Detector, DTC and buffers are implemented in Current Mode Logic (CML), while the divider and TDC are in standard CMOS. The power supply rails of the CML and CMOS blocks are separate to avoid disturbance coupling and each one has a dedicated and integrated decoupling capacitive bank.

The total power dissipation is 5.28 mW, leading to a jitter-power FoM of \(-247.5\) dB. Figure 3.8 shows the measured phase noise. The RMS jitter (integrated from 1 kHz to 30 MHz) is 182.5 fs, while the spot phase noise at 20 MHz offset is −150.7 dBc/Hz. Referred to a 900 MHz carrier, this noise satisfies the tight GSM specification of −162 dBc/Hz. The worst measured fractional spur is −50 dBc.
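These figures can be cross-checked with a short calculation. The Python sketch below assumes the commonly used jitter-power FoM definition, \(10\log _{10}[(\sigma /1\,\text {s})^2 \cdot (P/1\,\text {mW})]\), and refers the spot phase noise to 900 MHz by ideal frequency division from the 3.8 GHz carrier of Fig. 3.8; the function names are illustrative.

```python
import math

def jitter_power_fom(sigma_s, power_w):
    """Jitter-power FoM in dB: 10*log10((sigma / 1 s)^2 * (P / 1 mW))."""
    return 10.0 * math.log10(sigma_s**2 * power_w / 1e-3)

def scale_phase_noise(l_dbc_hz, f_from_hz, f_to_hz):
    """Refer a spot phase noise to another carrier (ideal frequency scaling)."""
    return l_dbc_hz + 20.0 * math.log10(f_to_hz / f_from_hz)

fom = jitter_power_fom(182.5e-15, 5.28e-3)
pn_900 = scale_phase_noise(-150.7, 3.8e9, 900e6)

print(f"FoM = {fom:.1f} dB")
print(f"phase noise referred to 900 MHz = {pn_900:.1f} dBc/Hz")
```

Under these assumptions the FoM agrees with the quoted value to within rounding, and the scaled spot noise falls about 1 dB below the −162 dBc/Hz limit.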

The locking-aid algorithm was tested by changing the divider control word and measuring the transient of the output frequency.

Fig. 3.8 Measured output phase noise of the implemented PLL at 3.8 GHz

Fig. 3.9 Measured locking time with and without the locking aid for large frequency steps: (a) without frequency aid technique, (b) with frequency aid technique

Figure 3.9 displays the transient response for coarse and fine frequency acquisition with the frequency aid technique (a) disabled and (b) enabled. The frequency discriminator block was always enabled to guarantee at least frequency locking. In the first row, a fine frequency step of 1 MHz is performed. Without the frequency aid branch the transient is heavily nonlinear and the target frequency is reached after 7 ms. With the two frequency aid branches active, instead, the lock condition is achieved in just \(110\,\upmu \)s. In the second row of Fig. 3.9, a step of 364 MHz is performed.

With the frequency-aid technique disabled (while keeping the frequency discriminator block), the circuit is unable to reach lock due to cycle slips. With the frequency-aid technique enabled, the PLL settles within 10 MHz of the final frequency value in 5.6 \(\upmu \)s and takes 180 \(\upmu \)s to fall below 1 kHz. A comparison with the state of the art is presented in Table 3.1. The table includes only works published at the time of the ISSCC submission of [4].

Table 3.1 Comparison table with other sub-6GHz DPLL

5 Conclusions

This chapter presented the design and the joint optimization of integrated output jitter and locking time in digital frequency synthesizers. Due to the limited dynamic range and the non-linear characteristic of the BB phase detector, a long locking transient is a common issue in digital BB-PD PLL architectures. The proposed locking-aid techniques break the trade-off between loop bandwidth and locking time. With this scheme the BB digital frequency synthesizer is able to lock in \(110\,\upmu \)s for a 1 MHz frequency step and in \(115\,\upmu \)s for a 364 MHz frequency step without adding any look-up table or state machine to the system. The steady-state jitter of the 3.7 GHz output signal is 182.5 fs, obtained by implementing a high-efficiency class-B oscillator with a double-tail resonator. The overall power consumption of 5.28 mW from a 1.2 V power supply leads to a power-jitter FoM of −247.5 dB.