1 Introduction

The Hilbert transformation is widely used in communication receivers and in radar and sonar systems, where it generates the analytic signal or complex signal envelope for image rejection and instantaneous frequency/phase estimation. The digital Hilbert transform has also been employed for the design of optimal minimum-phase FIR filters [7]. One application in a radio receiver is shown in Fig. 1: the radio frequency (RF) or intermediate frequency (IF) signal x with center frequency \(f_0\) is sampled with sampling rate \(f_s\). The Hilbert transformation \({\mathcal {H}}\) performs a \(90^{\circ }\) phase shift of the signal where

$$\begin{aligned} {\hat{x}}[k] = x[k] + j\,{{\mathcal {H}}}\{x[k]\} \end{aligned}$$
(1)

is referred to as the analytic signal or envelope and j is the imaginary unit. For analytic signals the spectrum vanishes for negative frequencies, i.e., \({\hat{X}}(f) \equiv 0,\, \forall f<0\). This enables a flexible design of multi-standard receivers if the filters are sufficiently broadband and have (approximately) linear phase in the passband. The analytic signal is digitally quadrature modulated to baseband with normalized angular frequency \(\Omega _0 = \omega _0/\omega _s\), where \(\omega _0\) is the angular modulation frequency and \(\omega _s\) the angular sampling frequency. For the modulation frequency \(\Omega _0 = \frac{\pi }{2}\), i.e., half the Nyquist frequency, the modulator is moreover multiplier-less. Hence the only bottleneck of this design is the realization of the Hilbert transformation.
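To illustrate why the modulator needs no multipliers at \(\Omega _0 = \frac{\pi }{2}\), the following minimal Python sketch mixes a complex sequence with \(e^{-j\,\Omega _0\,k}\). The function name `mix_quarter_rate` and the sign of the exponent (down-conversion) are assumptions made for illustration and are not taken from Fig. 1.

```python
def mix_quarter_rate(samples):
    """Multiply complex samples by exp(-j*pi/2*k) without multiplications.

    The mixing sequence only takes the values 1, -j, -1, +j, so every
    product reduces to sign changes and a swap of real and imaginary part.
    """
    out = []
    for k, s in enumerate(samples):
        re, im = s.real, s.imag
        phase = k % 4
        if phase == 0:        # factor 1
            out.append(complex(re, im))
        elif phase == 1:      # factor -j: (re + j*im) * (-j) = im - j*re
            out.append(complex(im, -re))
        elif phase == 2:      # factor -1
            out.append(complex(-re, -im))
        else:                 # factor +j: (re + j*im) * j = -im + j*re
            out.append(complex(-im, re))
    return out


if __name__ == "__main__":
    print(mix_quarter_rate([1 + 2j, 1 + 2j, 1 + 2j, 1 + 2j]))
```

In hardware, the four cases correspond to wiring and negation only, which is why the Hilbert transformer remains the dominant cost of the receiver front end.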

A variety of both analog and digital Hilbert transformers has been proposed, covering different frequency ranges. Broadband digital Hilbert transformation filters [5, 13, 17, 18] offer great flexibility in the design of multi-standard receivers. However, they are difficult to design with low effort, since the high image rejection requirements permit only a tiny deviation from the ideal \(90^{\circ }\) phase shift. Moreover, the group delay shall be constant in the passband, which can only be achieved exactly by FIR Hilbert filters [13, 14]; these, however, require a high filter order. In [14] Bernstein polynomials have been employed to achieve maximally flat FIR filters.

Fig. 1

Receiver generating the analytic signal followed by quadrature mixing to baseband. \(x_{HF,IF}\) are either the radio signal or the signal at the intermediate frequency, \(\Omega _0\) the angular modulation frequency and \(x_I/x_Q\) the inphase and quadrature phases, respectively

In contrast, IIR filters have a much lower filter order. Unfortunately, standard designs [5, 17, 18] exhibit large group delay variations, making them less suited for communication transceivers. In [5] an elliptic filter and in [17, 18] a Chebyshev approximation has been employed. Another, block-wise technique for calculating the analytic signal employs the FFT with subsequent filtering in the frequency domain. This method is, however, very time-consuming. Recently, a novel IIR design technique based on fractional derivatives and swarm optimization has been proposed [1,2,3]. This technique has also been applied successfully to the design of Hilbert transformers [4, 11].

This paper deals with the design and implementation of nearly linear phase IIR Hilbert filters based on the wave digital filter (WDF) technique [8, 16]. The filter design is based on either a collocation or a Galerkin approach [21]. The filters are realized as halfband filters, which reduces the implementation costs.

The hardware realization is multiplier-less, employing a Canonical Signed Digit (CSD) quantization of the filter coefficients [6, 10, 15, 19, 20]. In the frequency range of interest the IIR filters are moreover approximately linear phase and hence of nearly constant group delay. The VHDL implementation has been tested with normally distributed and uncorrelated random numbers as input signal. Furthermore, a bit-true SystemC model is available. A complete top-down procedure from filter design to hardware realization is discussed.

2 Design of Digital Hilbert Transformers

Wave digital filters [8, 9] are comprised of two parallel allpass filters. We consider therefore two allpass IIR filters with transfer functions \(H_1(z)\) and \(H_2(z)\), combined in a vector

$$\begin{aligned} H(z) = \begin{pmatrix} H_1(z)&H_2(z) \end{pmatrix}^T \end{aligned}$$
(2)

where the \(H_i(z),\, i=1, 2\) are rational functions of order N whose numerator and denominator polynomials are monic, \(a_{i,0}=b_{i,N}=1\), i.e.,

$$\begin{aligned} H_i(z) = \frac{\sum _{m=0}^N b_{i,m}\,z^{-m}}{\sum _{n=0}^N a_{i,n}\,z^{-n}}, \quad b_{i,N-n} = a_{i,n} \end{aligned}$$
(3)

with allpass condition \(b_{i,N-n} = a_{i,n},\,i=1,2\). The magnitude responses of the transfer functions of allpass filters along the unit circle \(z=e^{j\,\Omega }\) fulfill \(\left| {H_1\left( e^{j\,\Omega }\right) } \right| = \left| {H_2\left( e^{j\,\Omega }\right) } \right| =1\). For the phase responses, the \(90^{\circ }\) phase difference of Hilbert transformers, i.e.,

$$\begin{aligned} \Delta \varphi&= \textrm{arg}\left\{ H_1\left( e^{j\,\Omega }\right) \right\} - \textrm{arg}\left\{ H_2\left( e^{j\,\Omega }\right) \right\} = {\left\{ \begin{array}{ll} -\pi /2 &{} \Omega > 0 \\ \pi /2 &{} \, \Omega < 0 \end{array}\right. } \end{aligned}$$
(4)

must hold. Exploiting symmetries, similar to the design of halfband filters, the odd filter coefficients vanish identically, i.e., \(a_{i,2n+1}=b_{i,2n+1}\equiv 0,\; n \in \mathbb {N}\). The phase responses are calculated either by a Galerkin or by a collocation method [21] as follows:

Let \(H_2(z) = z^{-(N-1)}\) be a simple tapped delay line with phase \(\varphi _2(\Omega ) = -(N-1)\,\Omega \). Then the reference phase of allpass filter one, \(\varphi _{1,\text {ref}}\), becomes

$$\begin{aligned} \varphi _{1,\text {ref}}(\Omega ) = \varphi _2(\Omega ) + \Delta \varphi \end{aligned}$$
(5)

This phase function is approximated by an allpass filter (3) of degree N.
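The unit-magnitude property of the mirrored-coefficient structure (3) can be checked numerically. The sketch below is only an illustration: the helper name `allpass_response` and the coefficient values are hypothetical and do not correspond to a filter designed in this paper.

```python
import cmath


def allpass_response(a, omega):
    """Evaluate H(e^{j*omega}) for denominator coefficients a[0..N].

    The numerator is the mirrored denominator, b[m] = a[N - m], cf. (3).
    """
    order = len(a) - 1
    z_inv = cmath.exp(-1j * omega)
    den = sum(a[n] * z_inv ** n for n in range(order + 1))
    num = sum(a[order - m] * z_inv ** m for m in range(order + 1))
    return num / den


if __name__ == "__main__":
    # Hypothetical real coefficients with a_0 = 1 and vanishing odd entries.
    a = [1.0, 0.0, 0.4]
    for omega in (0.1, 1.0, 2.5):
        h = allpass_response(a, omega)
        print(f"Omega={omega}: |H|={abs(h):.12f}, phase={cmath.phase(h):.4f} rad")
```

For real coefficients the numerator on the unit circle equals the complex conjugate of the denominator times \(e^{-j\,N\,\Omega }\), so only the phase carries design information.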

2.1 Problem Formulation

Let \({\textbf{b}} = [b_0,\dots ,b_N]^T\) be the column vector of the \(N+1\) unknown numerator filter coefficients. Since for allpass filters the coefficients of the denominator are those of the numerator in reversed order, i.e., \(b_{N-n} = a_n\), only the vector \({\textbf{b}}\) must be calculated.

The allpass filter shall fulfill a prescribed phase characteristic with property

$$\begin{aligned} \varphi _{\text {ref}}(\Omega ) = -\varphi _{\text {ref}}(-\Omega ) \end{aligned}$$
(6)

according to (5) such that the condition

$$\begin{aligned} H_1\left( e^{j\Omega } \right) - e^{j\varphi _{\text {ref}}(\Omega )} = 0 \end{aligned}$$
(7)

shall hold. A re-calculation employing (3) leads to the equation

$$\begin{aligned} G\left( e^{j\Omega } \right) := \sum _{n=0}^N b_n\,e^{-j\,n\,\Omega } - e^{j\varphi _{\text {ref}}(\Omega )} \, \sum _{n=0}^N b_{N-n}\,e^{-j\,n\,\Omega } = 0 \end{aligned}$$
(8)

which will be approximated by either a Galerkin or collocation method.

2.2 Galerkin Approach

We search for the best approximation of (8) in a subspace spanned by trigonometric basis functions. The coefficients are calculated employing a Galerkin method. To this end, we define the inner product

$$\begin{aligned} \left\langle {f,\,g} \right\rangle := \frac{1}{2\pi } \int _{-\pi }^\pi f \,g^*\, \textrm{d}\Omega \end{aligned}$$
(9)

with induced norm \(\Vert f\Vert _2 = \sqrt{ \left\langle {f,\,f} \right\rangle }\), where the asterisk represents complex conjugation.

The Galerkin method calculates the unknown coefficients by requiring that the inner product vanishes for a suitable set of test functions \(\varphi _l\left( e^{j\Omega } \right) , \,l=0,\dots ,N\), i.e.,

$$\begin{aligned} \left\langle {G,\,\varphi _l} \right\rangle = 0, \quad l=0,\dots ,N \end{aligned}$$
(10)

Moreover, the test functions must span an \(N+1\) dimensional subspace. For numerical stability, it is common practice to employ orthonormal basis functions, i.e., \( \left\langle {\varphi _l,\,\varphi _m} \right\rangle = \delta _{lm}\), where \(\delta _{lm}\) is the Kronecker symbol. A suitable choice is the periodic Fourier basis \(\varphi _l = e^{-j\,l\,\Omega }\).

Calculating the inner product \( \left\langle {G,\,\varphi _l} \right\rangle \), one can see from the first summation term in (8) that

$$\begin{aligned} \left\langle { \sum _{n=0}^N b_n\,e^{-j\,n\,\Omega },\, \varphi _l } \right\rangle = b_l, \quad l=0,\dots ,N \end{aligned}$$
(11)

due to the orthogonality property of the basis functions. For the second summation term in (8) we obtain

$$\begin{aligned}&\left\langle { e^{j\varphi _{\text {ref}}(\Omega )}\,\sum _{n=0}^N b_{N-n}\,e^{-j\,n\,\Omega },\, \varphi _l } \right\rangle = \frac{1}{2\pi } \int _{-\pi }^\pi e^{j\varphi _{\text {ref}}(\Omega )}\,\sum _{n=0}^N b_{N-n}\,e^{-j\,(n-l)\,\Omega }\,\textrm{d}\Omega \end{aligned}$$
(12)

Introducing the short hand \(\psi _{nl}(\Omega ) = \varphi _{\text {ref}}(\Omega ) - (n-l)\,\Omega \) with property \(\psi _{nl}(-\Omega ) = -\psi _{nl}(\Omega )\), one can rewrite the integral above using well-known trigonometric identities to

$$\begin{aligned} \frac{1}{\pi } \int _{0}^\pi \sum _{n=0}^N b_{N-n}\, \cos {(\psi _{nl}(\Omega ))}\,\textrm{d}\Omega =: g_l({\textbf{b}}), \quad l=0,\dots , N \end{aligned}$$
(13)
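For a single summand, the step from (12) to (13) can be written out explicitly; this merely restates the symmetry argument, using Euler's formula and the odd symmetry of \(\psi _{nl}\):

$$\begin{aligned} \frac{1}{2\pi } \int _{-\pi }^\pi e^{j\,\psi _{nl}(\Omega )}\,\textrm{d}\Omega&= \frac{1}{2\pi } \int _{-\pi }^\pi \cos {(\psi _{nl}(\Omega ))}\,\textrm{d}\Omega + \frac{j}{2\pi } \int _{-\pi }^\pi \sin {(\psi _{nl}(\Omega ))}\,\textrm{d}\Omega \\&= \frac{1}{\pi } \int _{0}^\pi \cos {(\psi _{nl}(\Omega ))}\,\textrm{d}\Omega \end{aligned}$$

Since \(\psi _{nl}\) is odd, \(\cos (\psi _{nl})\) is even and \(\sin (\psi _{nl})\) is odd in \(\Omega \), so the sine integral over the symmetric interval vanishes.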

One obtains \(N+1\) equations for the \(N+1\) unknowns. Defining the vector \({\textbf{g}}({\textbf{b}}) = [g_0,\dots ,g_N]^T\), one obtains the linear homogeneous equation

$$\begin{aligned} {\textbf{b}} - {\textbf{g}}({\textbf{b}}) = 0 \end{aligned}$$
(14)

i.e., the solution lies in the kernel of (14). The kernel of the homogeneous equation can be calculated in a numerically stable way, e.g., by the singular value decomposition (SVD) or the QR algorithm.

2.3 Collocation Method

We define the inner product with weight function w

$$\begin{aligned} \left\langle {f,\,g} \right\rangle _w:= \frac{1}{2\pi } \int _{-\pi }^\pi w\, f \,g^*\,\textrm{d}\Omega \end{aligned}$$
(15)

where \(w(\Omega )>0\) (except possibly at countably many points in the interval \([-\pi ,\,\pi [\)). A suitable choice of the weighting function w emphasizes the accuracy of the approximation at specific frequencies. As test functions we employ Dirac delta distributions \(\varphi _l = \delta (\Omega - \Omega _l),\, l = 0,\dots ,L\), with \(L\ge N+1\) and \(\Omega _l \in [-\pi ,\,\pi [\). Employing the sifting property of the delta distribution one obtains

$$\begin{aligned}&\left\langle {G,\,\varphi _l} \right\rangle _w\,=\, w(\Omega _l)\,\sum _{n=0}^N b_n\,e^{-j\,n\,\Omega _l} -w(\Omega _l)\,e^{j\varphi _{\text {ref}}(\Omega _l)} \, \sum _{n=0}^N b_{N-n}\,e^{-j\,n\,\Omega _l} \end{aligned}$$
(16)

Collecting all equations in a system of linear homogeneous equations, we get \( \left[ \left\langle {G,\,\varphi _0} \right\rangle _w,\dots , \left\langle {G,\,\varphi _L} \right\rangle _w\right] ^T = 0 \). For \(L>N+1\) the system of equations is over-determined. One then obtains the solution that minimizes the Euclidean norm of the residual, employing either the SVD or the QR algorithm.
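The collocation equations (16) can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' design toolbox: NumPy is used, the function names `design_allpass_collocation` and `phase_error` are hypothetical, the weight is \(w \equiv 1\) by default, only positive collocation frequencies are used (the equations at negative frequencies are complex conjugates for real coefficients), the approximation band \([0.2\pi ,\,0.8\pi ]\) is chosen arbitrarily, and stability of the result is not enforced.

```python
import numpy as np


def design_allpass_collocation(order, omegas, weights=None):
    """Least-squares solution of (16) for the even-indexed coefficients.

    Returns b[0..N] with b[odd] = 0, scaled such that b_N = a_0 = 1.
    """
    omegas = np.asarray(omegas, dtype=float)
    w = np.ones_like(omegas) if weights is None else np.asarray(weights, dtype=float)
    # Reference phase (5) for Omega > 0: delay of N-1 samples minus pi/2.
    phi_ref = -(order - 1) * omegas - np.pi / 2
    even = np.arange(0, order + 1, 2)
    # Column belonging to b_n: e^{-j n W} - e^{j phi_ref} e^{-j (N - n) W}
    cols = (np.exp(-1j * np.outer(omegas, even))
            - np.exp(1j * phi_ref)[:, None]
            * np.exp(-1j * np.outer(omegas, order - even)))
    cols = cols * w[:, None]
    # Real unknowns: stack real and imaginary parts of every equation.
    system = np.vstack([cols.real, cols.imag])
    _, _, vt = np.linalg.svd(system)
    b_even = vt[-1] / vt[-1][-1]   # singular vector of the smallest singular value
    b = np.zeros(order + 1)
    b[even] = b_even
    return b


def phase_error(b, omegas):
    """Phase deviation of H_1 from the reference phase (5), wrapped to (-pi, pi]."""
    omegas = np.asarray(omegas, dtype=float)
    order = len(b) - 1
    z_inv = np.exp(-1j * np.outer(omegas, np.arange(order + 1)))
    h1 = (z_inv @ b) / (z_inv @ b[::-1])   # denominator: a_n = b_{N-n}
    phi_ref = -(order - 1) * omegas - np.pi / 2
    return np.angle(h1 * np.exp(-1j * phi_ref))


if __name__ == "__main__":
    grid = np.linspace(0.2 * np.pi, 0.8 * np.pi, 64)
    b = design_allpass_collocation(6, grid)
    print("b =", b)
    print("max phase error [rad]:", np.max(np.abs(phase_error(b, grid))))
```

Taking the right singular vector that belongs to the smallest singular value is one common way to solve an over-determined homogeneous system in the least-squares sense; a QR-based solver could be substituted.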

3 Implementation

The implementation of WDF filters, w.r.t. robustness in the presence of quantization, probability of over-/underflow etc., has been widely studied, see the representative works [8, 16]. First, the allpass filters are realized as cascades of first- and second-order filter sections, depending on whether the zeros/poles are real or complex conjugate. Second, the allpass filters are realized by so-called reflection matrices, which ensure stability, low sensitivity w.r.t. quantization and low implementation costs. We combine this well-proven implementation technique for WDF filters with a multiplier-free realization based on the CSD technique discussed below.

3.1 Cascade of First and Second Order Blocks

The transfer function \(H_2(z) = z^{-(N-1)}\) is a simple tapped delay line with linear phase \(\varphi _2(\Omega ) = -(N-1)\,\Omega \) and \(H_1(z)\) the Nth order allpass filter which approximates \(\varphi _{1,\text {ref}}(\Omega )\) as discussed in the previous section. This allpass filter \(A(z):= H_1(z)\) is realized in cascaded form [9]

$$\begin{aligned} A(z) = \prod _i A^{(i)}(z) \end{aligned}$$
(17)

where the \(A^{(i)}(z)\) are either first or second order allpass filters, namely

$$\begin{aligned} A^{(i)}(z)&= {\left\{ \begin{array}{ll} \frac{-\gamma _i + z^{-1}}{1 - \gamma _i \, z^{-1}}\\ \frac{-\gamma _{i0} + \gamma _{i1}\,(\gamma _{i0}-1)\, z^{-1} + z^{-2}}{1 + \gamma _{i1}\,(\gamma _{i0}-1)\, z^{-1} -\gamma _{i0}\, z^{-2}} \end{array}\right. } \end{aligned}$$
(18)

depending on whether the poles and zeros of A(z) are real or complex conjugate. The parameter values of \(\gamma \) lie in the range \(-1 \le \gamma < 1\), which guarantees stability of the filter. The \(\gamma \) coefficients are often referred to as reflection parameters [8, 9]. In [9] Gazsi addressed the prevention of limit cycles and discussed four variants of implementation depending on the parameter value of the \(\gamma \) coefficient. It was shown that, following the design rules in [9], there are no overflows and limit cycles for a sinusoidal excitation at any frequency under steady-state conditions. To this end, Gazsi introduced the so-called adaptor model with input signal vector \((a_1\, a_2)^T\) and output vector \((b_1\, b_2)^T\), related by the memory-less system

$$\begin{aligned} \begin{pmatrix} b_1\\ b_2 \end{pmatrix} = \begin{pmatrix} -\gamma &{} 1 + \gamma \\ 1 - \gamma &{} \gamma \end{pmatrix} \, \begin{pmatrix} a_1\\ a_2 \end{pmatrix} \end{aligned}$$
(19)

Its four variants of implementation require only a single multiplication each [9].

3.1.1 First-Order Allpass Filter Section

We consider a real pole and zero with 1st order transfer function

$$\begin{aligned} A^{(i)}(z) = \frac{b_1(z)}{a_1(z)} = \frac{-\gamma + z^{-1}}{1 - \gamma \, z^{-1}} \end{aligned}$$
(20)

Setting \(a_2 = z^{-1}\,b_2\) one obtains from (19) after a short calculation the first-order allpass filter stage (20).
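The short calculation can be cross-checked in the time domain. The following Python sketch (the function name `first_order_allpass` is hypothetical) evaluates the adaptor equations (19) sample by sample with port 2 closed by a unit delay, \(a_2[k] = b_2[k-1]\), and prints the impulse response, which should equal that of (20), namely \(h[0] = -\gamma \) and \(h[k] = (1-\gamma ^2)\,\gamma ^{k-1}\) for \(k \ge 1\).

```python
def first_order_allpass(x, gamma):
    """Filter x through one adaptor (19) whose port 2 is closed by a delay."""
    state = 0.0                      # holds b_2[k-1], i.e. a_2[k]
    y = []
    for a1 in x:
        a2 = state
        b1 = -gamma * a1 + (1 + gamma) * a2
        b2 = (1 - gamma) * a1 + gamma * a2
        state = b2                   # feed b_2 back through z^{-1}
        y.append(b1)
    return y


if __name__ == "__main__":
    gamma = 0.375                    # illustrative value only
    impulse = [1.0] + [0.0] * 5
    print(first_order_allpass(impulse, gamma))
    print([-gamma] + [(1 - gamma ** 2) * gamma ** (k - 1) for k in range(1, 6)])
```

The sketch uses two multiplications per sample for readability; the single-multiplier variants of [9] rearrange the same equations.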

3.1.2 Second-Order Allpass Filter Section

A second-order filter is realized by two first-order adaptor sections in series [9]. The signal waveforms are marked by \((')\) or \(('')\) for the front and rear circuitry, respectively. The rear first-order stage is equivalent to the first-order section above with the transfer function given by (20), where \(\gamma \) is substituted by \(\gamma _0\), i.e.,

$$\begin{aligned} A^{('')}(z) = \frac{b^{('')}_1(z)}{a^{('')}_1(z)} = \frac{-\gamma _0 + z^{-1}}{1 - \gamma _0 \, z^{-1}} \end{aligned}$$
(21)

The two adaptors are connected via \(a^{(')}_2 = b_1^{('')}\) and \(a^{('')}_1 = z^{-1}\,b_2^{(')}\). With coefficient \(\gamma _1\) for the front adaptor stage, the transfer function of the series connection then reads

$$\begin{aligned} A^{(i)}(z) = \frac{b_1^{(')}(z)}{a_1^{(')}(z)} = \frac{-\gamma _{0} + \gamma _{1}\,(\gamma _{0}-1)\, z^{-1} + z^{-2}}{1 + \gamma _{1}\,(\gamma _{0}-1)\, z^{-1} -\gamma _{0}\, z^{-2}} \end{aligned}$$
(22)

3.2 Multiplier-less Canonical Signed Digit Implementation

The filter coefficients are quantized employing the Canonical Signed Digit (CSD) representation [6, 10, 15, 19, 20]. The CSD representation is more flexible than two’s complement and is therefore advantageous for a multiplier-less, high-throughput and power-saving implementation of the filter, since the filter coefficients can generally be realized with considerably fewer nonzero digits without loss of accuracy. It can therefore reduce the complexity of the hardware. CSD employs a ternary digit set, i.e., \(\{0,\,\pm 1\}\), with a minimal number of nonzero digits. In [19, 20] a reduction of nonzero digits of about 33% has been reported compared with a two’s complement representation. The signals are represented in two’s complement in the standard manner. In [15] the CSD coefficients are further optimized by scaling.

First, the transfer function H(z) of the Hilbert transformer is reformulated as a cascade of first- and second-order filter stages (17) of form (18), employing the adaptor model (19).

Second, the filter coefficients \(\gamma \) of the first- and second-order stages are CSD quantized, i.e.,

$$\begin{aligned} \gamma = \sum _{\ell =1}^{B} a_\ell \,2^{-\ell },\quad a_\ell \in \{0,\,\pm 1\} \end{aligned}$$
(23)

so that the coefficients lie in the range \(\gamma \in [-1+2^{-B},\, 1-2^{-B}]\).

In [12] a mapping algorithm from two’s complement to CSD has been proposed. Here, an optimization toolbox has been developed which represents the filter coefficients by the smallest number of nonzero digits \(a_\ell \in \{0,\,\pm 1\}\) while fulfilling the design requirements.
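As a minimal illustration of (23), the sketch below rounds a coefficient to B fractional bits and converts it with the standard non-adjacent-form recoding, then multiplies an integer sample by shifts and additions only. It is neither the mapping algorithm of [12] nor the optimization toolbox mentioned above; the function names `to_csd` and `csd_multiply` are hypothetical, and a coefficient that rounds to \(\pm 1\) would need an additional digit of weight \(2^0\), which lies outside (23).

```python
def to_csd(gamma, bits):
    """Return {ell: a_ell} with gamma ~ sum(a_ell * 2**-ell), a_ell = +1 or -1.

    gamma is first rounded to the nearest multiple of 2**-bits.
    """
    q = round(gamma * 2 ** bits)
    digits = {}
    pos = 0                          # current digit has weight 2**(pos - bits)
    while q != 0:
        if q % 2:
            d = 2 - (q % 4)          # +1 if q = 1 (mod 4), -1 if q = 3 (mod 4)
            q -= d                   # now q is divisible by 4: next digit is 0
            digits[bits - pos] = d
        q //= 2
        pos += 1
    return digits


def csd_multiply(x, digits):
    """Multiply integer x by the CSD coefficient using shifts and adds only.

    Right shifts truncate, as they would in fixed-point hardware.
    """
    acc = 0
    for ell, d in digits.items():
        term = x >> ell
        acc += term if d > 0 else -term
    return acc


if __name__ == "__main__":
    digits = to_csd(0.4375, 4)       # 0.4375 = 2**-1 - 2**-4
    print(digits, csd_multiply(160, digits))
```

In this example the binary pattern 0.0111 has three nonzero digits, whereas the CSD form needs only two, so one adder is saved.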

4 Results

The Hilbert transformers have been realized in VHDL and SystemC. The test signal is a statistically independent (white) normally distributed noise signal with zero mean (\(\mu =0\)) and standard deviation \(\sigma =0.25\). Employing the CSD representation of the filter coefficients, the filters are realized multiplier-less. Figs. 2 and 4 depict the magnitude responses of the filter transfer function \(H(z) = H_1(z) + j\,H_2(z)\), both with floating point and CSD quantized coefficients, which is ideally unity for \(\Omega > 0\) and zero for \(\Omega < 0\) if the \(\pm \pi /2\) phase difference in (5) holds exactly. This corresponds to an image rejection ratio \(\left| {H\left( e^{j\,\Omega }\right) } \right| /\left| {H\left( e^{-j\,\Omega }\right) } \right| ,\,\Omega \in [0,\pi ]\) of infinity. Hence the magnitude response \(\left| {H\left( e^{j\,\Omega }\right) } \right| \) and the derived image rejection ratio are the primary figures of merit (FOM) of the Hilbert transformer. They are also an indirect measure of the phase error of (5).
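The link between image rejection and phase error can be made concrete under a simplifying assumption: both branches have exactly unit magnitude, and their phase difference deviates from \(\pm \pi /2\) by an error \(\varepsilon \) that is odd in \(\Omega \). The two sideband magnitudes are then proportional to \(|1+e^{j\varepsilon }|\) and \(|1-e^{j\varepsilon }|\), so their ratio is \(\cot (|\varepsilon |/2)\). The following sketch (function names are hypothetical) evaluates this relation.

```python
import math


def image_rejection_db(phase_error_rad):
    """Image rejection in dB for unit-magnitude branches with phase error eps."""
    eps = abs(phase_error_rad)
    if eps == 0.0:
        return math.inf              # ideal 90 degree phase difference
    return 20 * math.log10(1 / math.tan(eps / 2))


def max_phase_error_deg(irr_db):
    """Largest phase error in degrees that still achieves irr_db."""
    return math.degrees(2 * math.atan(10 ** (-irr_db / 20)))


if __name__ == "__main__":
    print(f"{max_phase_error_deg(50):.3f} degrees for 50 dB")
```

Under these assumptions an image rejection of 50 dB corresponds to a phase error of roughly \(0.36^{\circ }\), which shows why only a tiny deviation from the ideal phase difference can be tolerated.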

4.1 Filter Order \(N=6\)

Fig. 2 depicts the magnitude response, both float and quantized, of a Hilbert transformer of order \(N=6\) quantized with a decimal bitwidth of \(B_d=10\) bits. The WDF implementation requires 3 adaptors. This filter is realized by only 14 dash operations. The image rejection ratio is \(>50\) dB for about \(60\%\) of the bandwidth.

Fig. 2

\(H(z) = H_1(z) + j\,H_2(z)\) with filter order \(N=6\) and CSD quantized by a wordlength of \(B_d=10\) versus float

The VHDL implementation has been simulated with normally distributed random numbers. Fig. 3a depicts the filter response for a decimal bitwidth of \(B_d=8\) bits and Fig. 3b for \(B_d=10\) bits.

Fig. 3

Power spectral density of an analytic signal with normally distributed random numbers as input, \(\mu =0\), \(\sigma =0.25\), decimal bitwidth \(B_d=8\) bit (left) and \(B_d=10\) bit (right)

4.2 Filter Order \(N=10\)

Fig. 4 depicts the magnitude response, both float and quantized, of a Hilbert transformer of order \(N=10\) quantized with a decimal bitwidth of \(B_d=10\) bits. The WDF implementation requires 5 adaptors. This filter is realized by only 32 dash operations. The image rejection ratio is \(>50\) dB for about \(70\%\) of the spectrum.

Fig. 4

\(H(z) = H_1(z) + j\,H_2(z)\) with filter order \(N=10\) and CSD quantized by a wordlength of \(B_d=10\) versus float

The VHDL code has been simulated with the same normally distributed input signal as described in the previous section. Fig. 5a depicts the filter response for a decimal bitwidth of \(B_d=8\) bits and Fig. 5b for \(B_d=10\) bits.

Fig. 5

Power spectral density of an analytic signal with normally distributed random numbers as input, \(\mu =0\), \(\sigma =0.25\), decimal bitwidth \(B_d=8\) bit (left) and \(B_d=10\) bit (right)

4.3 Synthesis for Intel Cyclone V FPGA

The design utilizes five integer and eight decimal bits and has been synthesized with Intel Quartus Prime Lite 18.1 for the Intel Cyclone V FPGA. Table 1 lists the maximum clock frequency and the resource utilization in terms of adaptive logic modules (ALM), adaptive look-up tables (ALUT) and registers for the Hilbert transformers with orders \(N=6\) and \(N=10\). The Hilbert transformer of order \(N=6\) can be operated with a maximum clock frequency of approximately 55.42 MHz. This results in a Nyquist frequency of 27.71 MHz. The bandwidth of the image frequency rejection is approximately \(60\%\) of the Nyquist frequency, which is 16.6 MHz.

For the Hilbert transformer of order \(N=10\), the maximum clock frequency is 44.47 MHz, which results in a Nyquist frequency of 22.23 MHz.

The bandwidth of the image frequency rejection is approximately \(80\%\) of the Nyquist frequency, which is 17.78 MHz. Table 1 summarizes the results for both exemplary filter designs.
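The bandwidth figures quoted above follow from simple arithmetic, assuming that one sample is processed per clock cycle so that the sampling rate equals the maximum clock frequency. A minimal sketch (the helper name `rejection_bandwidth_mhz` is hypothetical):

```python
def rejection_bandwidth_mhz(f_clk_mhz, fraction):
    """Return (Nyquist frequency, image rejection bandwidth) in MHz."""
    f_nyquist = f_clk_mhz / 2        # assumes one sample per clock cycle
    return f_nyquist, fraction * f_nyquist


if __name__ == "__main__":
    for order, f_clk, fraction in ((6, 55.42, 0.60), (10, 44.47, 0.80)):
        f_nyq, bw = rejection_bandwidth_mhz(f_clk, fraction)
        print(f"N={order}: Nyquist {f_nyq:.2f} MHz, bandwidth {bw:.2f} MHz")
```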

Table 1 Synthesis results of Hilbert transformers for the Intel Cyclone V FPGA: Adaptive Logic Module (ALM), Adaptive Look-Up Table (ALUT)

4.4 Conclusion

This paper presents a novel technique for the design of broadband and nearly linear phase digital Hilbert transformers, e.g., for generating the analytic signal, and their implementation in VHDL. The method is based on the wave digital filter technique. The two transfer functions are rational functions and match the \(90^{\circ }\) phase difference requirement excellently, even at low filter orders, which is crucial for a high image rejection. The design method employs either a Galerkin or a collocation technique.

We obtained excellent image rejection capabilities of \(>50\) dB even at moderate filter orders and practically constant group delays in the passband. Exploiting symmetries, the odd filter coefficients of the transfer functions vanish identically, resulting in a highly efficient hardware realization.

The filter coefficients are optimally quantized employing the Canonical Signed Digit representation. The optimized CSD representation significantly outperforms the quantization by two’s complement both in accuracy and implementation costs. The CSD representation of the filter coefficients is implemented multiplier-less, which enables low-power and high-speed signal processing.

Besides a design and quantization toolbox, the authors developed parameterizable VHDL and SystemC models. A MATLAB function for automatically generating a VHDL package containing the filter parameters has also been implemented. Synthesis has been performed for the Intel Cyclone V FPGA with Intel Quartus Prime Lite 18.1. Hence a seamless top-down design flow from filter design to VHDL synthesis has been realized.