1 Introduction

Sparse FFT methods can be used in many different applications where it is a priori known that the signal is sparse in the time/space or frequency domain. Such algorithms have gained considerable interest in recent years.

Many deterministic sparse FFT algorithms are based on combinatorial approaches or phase shift, see e.g. [1, 3, 6, 9, 10, 19]. These approaches usually need access to arbitrary values of a given function \(f(x) =\sum _{j=1}^M a_j \, {\mathrm e}^{2 \pi {\mathrm i} w_j x}\) assuming that the unknown frequencies \(w_j\) are in \([-N/2, N/2) \cap {\mathbb Z}\). The sparse FFT techniques in [8, 17] are based on Prony’s method.

By contrast, the deterministic algorithms proposed in [11, 13, 14, 16], or in [15], Section 5.4, consider the fully discrete problem, where for a given vector \({\mathbf{x}} \in {\mathbb C}^N\), we want to efficiently compute its discrete Fourier transform \({\hat{\mathbf{x}}}\) under the assumption that \(\hat{\mathbf{x}}\) is M-sparse or has a short support of length M. Recently, these techniques have also been transferred to derive sparse fast algorithms for the discrete cosine transform, [4, 5].

Problem statement Let \({\mathbf{x}} = (x_j)_{j=0}^{N-1} \in {\mathbb C}^N\) with \(N= 2^J\) for some \(J>1\). Further, let \({\mathbf{F}}_N :=(\omega _N^{jk})_{j,k=0}^{N-1} \in {\mathbb C}^{N \times N}\) with \(\omega _N := {\mathrm e}^{-2 \pi {\mathrm i}/N}\) denote the Fourier matrix of order N, and \({\mathbf{F}}_N^{-1} = \frac{1}{N} \overline{{\mathbf{F}}}_{N}\). We consider the following two scenarios, which can essentially be treated with the same algorithm.

  (a) Assume that \(\hat{\mathbf{x}} :={\mathbf{F}}_N \, {\mathbf{x}} = (\hat{x}_k)_{k=0}^{N-1}\) is given. How can we determine \({\mathbf{x}}\) from \(\hat{\mathbf{x}}\) in a sublinear way, if it can be assumed that \({\mathbf{x}}\) is M-sparse with \(M^2 < N\)?

  (b) Assume that \({\mathbf{x}} \in {\mathbb C}^N\) is given. How can we determine \(\hat{\mathbf{x}} = {\mathbf{F}}_N {\mathbf{x}}\) from \({\mathbf{x}}\) in a sublinear way, if it can be assumed that \(\hat{\mathbf{x}}\) is M-sparse with \(M^2 < N\)?

In both scenarios, M need not be known beforehand. However, if M is known, this knowledge can be used to simplify the algorithm. Throughout the paper, we say that a vector \({\mathbf{x}}\) is M-sparse if only M of its components have an amplitude exceeding a predetermined small threshold \(\epsilon >0\).

This paper is organized as follows. In Sect. 2, we summarize the basic multi-scale idea of the algorithm used in [16] for scenario (a). Section 3 is devoted to the extension of the method in [16]. First, we present the general pseudocode of the sparse FFT algorithm. The numerical stability of this algorithm mainly depends on the condition number of special Vandermonde matrices, which are used at each iteration step for solving a linear system with at most M unknowns. In Sect. 3.1 we give an estimate of the condition number of the occurring Vandermonde matrices, which are partial matrices of the Fourier matrix. This estimate is then used to determine the two free parameters of the Vandermonde matrix: one parameter stretches the given nodes generating the matrix, and the second determines its number of rows. In Sect. 4 we briefly show how the derived algorithm can be adapted to solve the sparse FFT problem (b). Finally, in Sect. 5 we demonstrate the large impact of the new approach that allows rectangular Vandermonde matrices. A Python implementation of the new algorithm is available under the link “software” on our homepage http://na.math.uni-goettingen.de.

2 Multi-scale Sparse Sublinear FFT Algorithm from [16]

We consider the problem stated in (a) to derive an iterative stable procedure to reconstruct \(\mathbf {x}\) from adaptively chosen Fourier entries of \(\hat{\mathbf{x}}\). To state the multi-scale algorithm from [16], we need to define the periodized vectors

$$\begin{aligned} \mathbf {x}^{(j)} = (x_{k}^{(j)})_{k=0}^{2^{j}-1} :=\Big (\sum _{l=0}^{2^{J-j}-1}x_{k+2^{j}l}\Big )_{k=0}^{2^{j}-1}\in \mathbb {C}^{2^{j}}, \qquad j=0,\ldots ,J. \end{aligned}$$
(1)

In particular, \(\mathbf {x}^{(J)}=\mathbf {x}\), and \(\mathbf {x}^{(0)}=\sum _{k=0}^{N-1}x_{k}\) is the sum of all components of \(\mathbf {x}\). Observe that, if the vector \(\hat{\mathbf{x}} = (\hat{x}_k)_{k=0}^{N-1}\) is known, then the Fourier transforms \(\hat{\mathbf{x}}^{(j)}\) of the periodized vectors are also immediately known, and we have

$$\begin{aligned} \hat{\mathbf{x}}^{(j)} = {\mathbf{F}}_{2^j} {\mathbf{x}}^{(j)} = (\hat{x}_{2^{J-j}k})_{k=0}^{2^j-1} \end{aligned}$$

(see Lemma 2.1 in [13]). Throughout the paper, we assume that no cancellation appears in the periodized vectors, i.e., for each significant component \(|{x}_k| > \epsilon \) of \(\mathbf {x}\), \(k \in \{0, \ldots , N-1\}\), we have

$$\begin{aligned} |x_{k'}^{(j)}| > \epsilon \text { for all } {j}=0,\ldots ,{J}-1 , \qquad k' = k \, \mathrm {mod} \, 2^{j} \end{aligned}$$
(2)

for a fixed threshold constant \(\epsilon >0\). Condition (2) is, for example, satisfied if all components of \({\mathbf{x}}\) lie in one quadrant of the complex plane, e.g. \(\text {Re}\, x_{j} \ge 0\) and \(\text {Im} \, x_{j} \ge 0\) for \(j=0, \ldots , N-1\).
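To make this concrete, the following small numpy sketch (our own illustration, not part of the original algorithm) computes the periodizations (1) and verifies the subsampling relation \(\hat{\mathbf{x}}^{(j)} = (\hat{x}_{2^{J-j}k})_{k=0}^{2^j-1}\); note that numpy's fft uses the same sign convention \(\omega_N = {\mathrm e}^{-2\pi{\mathrm i}/N}\) as \({\mathbf{F}}_N\) above.

```python
import numpy as np

def periodize(x, j):
    """Periodization x^(j) from (1): sum the length-N vector x over blocks of length 2**j."""
    N = len(x)
    return x.reshape(N // 2**j, 2**j).sum(axis=0)

# small example with J = 4, i.e. N = 16, and a 3-sparse x in one quadrant (cf. (2))
rng = np.random.default_rng(0)
J = 4
x = np.zeros(2**J, dtype=complex)
x[[1, 5, 11]] = rng.random(3) + 1j * rng.random(3)

xhat = np.fft.fft(x)                                   # hat{x} = F_N x
for j in range(J + 1):
    xj = periodize(x, j)
    # F_{2^j} x^(j) coincides with the subsampled vector (hat{x}_{2^{J-j} k})_k
    assert np.allclose(np.fft.fft(xj), xhat[::2**(J - j)])
```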

Idea of the algorithm The multi-scale algorithm in [16] iteratively computes \({\mathbf{x}}^{(j+1)}\) from \({\mathbf{x}}^{(j)}\) for \(j=j_0, \ldots , J-1\). If the sparsity M of \({\mathbf{x}}\) is unknown, then we start with \(j_0=0\) and \({\mathbf{x}}^{(0)} := \hat{x}_0 = \sum _{k=0}^{N-1} x_k\). If M with \(M^2 < N\) is known beforehand, then we fix \(j_0 = \lfloor \log _2 M \rfloor +1\) and compute

$$\begin{aligned} {\mathbf{x}}^{(j_0)}:= {\mathbf{F}}_{2^{j_0}}^{-1} \hat{\mathbf{x}}^{(j_0)} = \frac{1}{2^{j_0}} \overline{\mathbf{F}}_{2^{j_0}} \, (\hat{x}_{2^{J-j_0}k})_{k=0}^{2^{j_0}-1} \end{aligned}$$

using an FFT algorithm with complexity \(\mathcal {O}(j_0 \, 2^{j_0}) = \mathcal {O}(M\, \log M) \). At the j-th iteration step, we assume that \({\mathbf{x}}^{(j)} \in {\mathbb C}^{2^{j}}\) with sparsity \(M_{j}\) has already been computed. Then we always have \(M_j \le M\). For \(M_j^2 < 2^j\), the computation of \({\mathbf{x}}^{(j+1)}\) from \({\mathbf{x}}^{(j)}\) is based on the following theorem (see Theorem 2.2 in [16]).
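If M is known, the starting vector \({\mathbf{x}}^{(j_0)}\) is thus obtained from one small inverse FFT applied to the subsampled data; a minimal sketch (our own, with `xhat` holding the accessible entries of \(\hat{\mathbf{x}}\)):

```python
import numpy as np

def start_vector(xhat, J, M):
    """Compute x^(j0) = F_{2^j0}^{-1} (hat{x}_{2^(J-j0) k})_k with j0 = floor(log2 M) + 1."""
    j0 = int(np.floor(np.log2(M))) + 1
    return j0, np.fft.ifft(xhat[::2**(J - j0)])   # O(M log M) inverse FFT of length 2**j0
```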

Theorem 2.1

Let \(\mathbf {x}^{(j)}\), \(j=0,\ldots ,J-1\), be the vectors defined in (1) satisfying (2). Then, for each \(j=0,\ldots ,J-1\), we have: if \(\mathbf {x}^{(j)} \in \mathbb {C}^{2^j}\) is \(M_j\)-sparse with support indices \(0\le n_{1}<n_{2}<\ldots <n_{M_j} \le 2^{j}-1\), then the vector \(\mathbf {x}^{(j+1)}\) can be uniquely recovered from \(\mathbf {x}^{(j)}\) and \(M_{j}\) components \(\hat{x}_{k_{1}},\ldots ,\hat{x}_{k_{M_j}}\) of \(\hat{\mathbf {x}}=\mathbf {F}_{N}\, \mathbf {x}\), where the indices \(k_{1},\ldots ,k_{M_{j}}\) are taken from the set \(\{2^{J-j-1}(2l+1):\, l=0,\ldots ,2^j-1\}\) such that the matrix

$$\begin{aligned} \mathbf {A}^{(j)}:=\left( \omega _{{N}}^{k_{p} n_{r}}\right) _{p,r=1}^{M_{j}} \end{aligned}$$
(3)

is invertible.

The proof of Theorem 2.1 is constructive. With the notation \({\mathbf{x}}^{(j+1)}= \left( \begin{array}{c} {\mathbf{x}}_{0}^{(j+1)}\\ {\mathbf{x}}_{1}^{(j+1)} \end{array} \right) \), i.e., \(\mathbf {x}_{0}^{(j+1)}:= \Big (x_{\ell }^{(j+1)}\Big )_{\ell =0}^{2^{j}-1} \) and \(\mathbf {x}_{1}^{(j+1)}:= \Big (x_{\ell }^{(j+1)}\Big )_{\ell =2^{j}}^{2^{j+1}-1} \), we have from (1)

$$\begin{aligned} \mathbf {x}^{(j)}=\mathbf {x}_{0}^{(j+1)}+\mathbf {x}_{1}^{(j+1)}. \end{aligned}$$
(4)

Thus, if \(\mathbf {x}^{(j)}\) is known, it suffices to compute \({\mathbf{x}}_0^{(j+1)}\), while \(\mathbf {x}_{1}^{(j+1)}\) then follows from (4). We can now use the factorization of the Fourier matrix \({\mathbf{F}}_{2^{j+1}}\) (see Equation (5.9) in [15]), and obtain

$$\begin{aligned} \left( \!\!\begin{array}{c} (\hat{x}_{2\ell }^{(j+1)})_{\ell =0}^{{2^j-1}}\\ (\hat{x}_{2\ell +1}^{(j+1)})_{\ell =0}^{{2^j-1}} \end{array}\!\!\! \right) = \left( \!\!\!\begin{array}{cc} {\mathbf{F}}_{2^j} &{} {\mathbf{0}}\\ {\mathbf{0}} &{} {\mathbf{F}}_{2^j} \end{array} \!\!\!\right) \! \left( \!\!\!\begin{array}{c} \mathbf {x}_{0}^{(j+1)} + \mathbf {x}_{1}^{(j+1)} \\ {\mathbf{W}}_{2^j} (\mathbf {x}_{0}^{(j+1)} - \mathbf {x}_{1}^{(j+1)}) \end{array} \!\!\!\right) = \left( \!\!\!\begin{array}{c} {\mathbf{F}}_{2^j} {\mathbf{x}}^{(j)} \\ {\mathbf{F}}_{2^{j}} \, {\mathbf{W}}_{2^j} (2 {\mathbf{x}}_0^{(j+1)} - {\mathbf{x}}^{(j)}) \end{array} \!\!\!\right) , \end{aligned}$$

where \({\mathbf{W}}_{2^j} := \text {diag} \, (\omega _{2^{j+1}}^0, \ldots , \omega _{2^{j+1}}^{2^j-1})\), and \({\mathbf{0}}\) denotes the zero matrix of size \(2^{j} \times 2^{j}\). Thus, we conclude

$$\begin{aligned} {\mathbf{F}}_{2^{j}} \, {\mathbf{W}}_{2^j} \Big ( 2{\mathbf{x}}_0^{(j+1)} - {\mathbf{x}}^{(j)} \Big ) = \Big ( \hat{x}_{2\ell +1}^{(j+1)}\Big )_{\ell =0}^{{2^j-1}}. \end{aligned}$$
(5)

Further, (4) together with (2) implies that \({\mathbf{x}}_0^{(j+1)}\) can only have significant entries for the same index set as \({\mathbf{x}}^{(j)}\), and we have to compute only these \(M_j\) entries. Introducing the restricted vectors

$$\begin{aligned} \tilde{\mathbf {x}}^{(j+1)}_{0}:= \left( x^{(j+1)}_{n_{r}}\right) _{r=1}^{M_{j}}\in \mathbb {C}^{M_{j}}, \ \ \ \tilde{\mathbf {x}}^{(j)}:= \left( x^{(j)}_{n_{r}}\right) _{r=1}^{M_{j}}\in \mathbb {C}^{{M_{j}}},\nonumber \end{aligned}$$

we can also restrict the matrix \({\mathbf{F}}_{2^{j}} \, {\mathbf{W}}_{2^j} \in {\mathbb C}^{2^j \times 2^j}\) in the linear system (5) to its \(M_j\) columns with indices \(n_r\). Finally, it suffices to restrict the system in (5) to \(M_j\) linearly independent rows, and \({\mathbf{x}}_0^{(j+1)}\) can still be uniquely computed. Therefore, a restriction \({\mathbf{A}}^{(j)} \in {\mathbb C}^{M_j \times M_j}\) of the product \({\mathbf{F}}_{2^{j}} \, {\mathbf{W}}_{2^j}\) can be chosen as

$$\begin{aligned} \mathbf {A}^{(j)}:= \left( \omega _{{2^j}}^{h_p n_{r}}\right) _{p,r=1}^{M_{j}} \, \text {diag} \left( \omega _{2^{j+1}}^{n_1}, \ldots ,\omega _{2^{j+1}}^{n_{M_j}} \right) . \end{aligned}$$
(6)

Here, the matrix \(\left( \omega _{{2^j}}^{h_p n_{r}}\right) _{p,r=1}^{M_{j}}\) is the restriction of \({\mathbf{F}}_{2^j}\) to the rows \(0 \le h_1< h_2< \ldots < h_{M_j} \le 2^j-1\) and to the columns \(n_r\), \(r=1, \ldots , M_j\), corresponding to the support indices of \({\mathbf{x}}^{(j)}\). The diagonal matrix is the restriction of \({\mathbf{W}}_{2^j}\) to the rows and columns \(n_r\). Comparison with (3) yields \(k_p = 2^{J-j-1}(2h_p+1)\), \(p=1, \ldots , M_j\). In Algorithm 2.3 in [16], Theorem 2.1 is applied to iteratively compute \({\mathbf{x}}^{(j+1)}\) from \({\mathbf{x}}^{(j)}\), whenever solving the restricted linear system

$$\begin{aligned} {\mathbf{A}}^{(j)} \Big ( 2\tilde{\mathbf{x}}_0^{(j+1)} - \tilde{\mathbf{x}}^{(j)} \Big ) = \Big ( \hat{x}_{2h_p+1}^{(j+1)}\Big )_{p=1}^{M_j} \end{aligned}$$
(7)

is cheaper than an FFT algorithm for vectors of length \(2^j\).

The further results in [16] focus on finding good choices of the indices \((h_p)_{p=1}^{M_j}\) at each iteration step. There, the paper restricts itself to matrices \(\mathbf {A}^{(j)}\) of the form

$$\begin{aligned} \mathbf {A}^{(j)}:= \left( \omega _{{2^j}}^{\sigma _j \, p \, n_{r}}\right) _{p=0,r=1}^{M_{j}-1,M_j} \, \text {diag} \left( \omega _{2^{j+1}}^{{n_1}}, \ldots , \omega _{2^{j+1}}^{n_{M_j}} \right) , \end{aligned}$$
(8)

i.e., we choose \(h_{p+1} = \sigma _j p\) for \(p=0, \ldots , M_j-1\) and some parameter \(\sigma _{j} \in \{1, \ldots , 2^{j}-1\}\). The first matrix in the factorization (8) is a Vandermonde matrix generated by the roots of unity \(\omega _{2^j}^{\sigma _j n_r}\), \(r=1, \ldots , M_j\). The iterative algorithm based on Theorem 2.1 will be stable if the linear system (7) can be solved efficiently and in a stable way at each level \(j=j_0, \ldots , J-1\). Therefore, [16] tries to find parameters \(\sigma _j \in \{ 1, \ldots , 2^j-1\}\) such that

$$\begin{aligned} {\mathbf{V}}_{M_j}(\sigma _j): = \Big ( \omega _{2^j}^{\sigma _j \, p \, n_r} \Big )_{p=0,r=1}^{M_j-1, M_j} \end{aligned}$$

is invertible and has a good condition number. Observe that \({\mathbf{V}}_{M_j}(\sigma _j)\) is always invertible if we choose \(\sigma _j=1\). However, \(\sigma _j =1\) can lead to a very large condition number of \({\mathbf{V}}_{M_j}(\sigma _j)\) and hence of \(\mathbf {A}^{(j)}\).
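The following sketch (our own illustration, not the implementation of [16]) performs one such update step: it assembles \({\mathbf{A}}^{(j)}\) as in (8), solves (7), and recovers \({\mathbf{x}}^{(j+1)}\) via (4). For simplicity it uses a plain least-squares call instead of the specialized \({\mathcal O}(M^2)\) Vandermonde solver from [7]; the parameter `Mprime` already anticipates the rectangular extension of Sect. 3 and equals \(M_j\) in the square setting of this section.

```python
import numpy as np

def refine(x_j, n, xhat, J, j, sigma, Mprime):
    """One update step (7) + (4): compute x^(j+1) from the M_j-sparse periodization x^(j).

    x_j    : periodization of length 2**j with (sorted) support index array n
    xhat   : DFT of the full vector x; only Mprime adaptively chosen entries are read
    sigma  : stretching parameter sigma_j,  Mprime : number of rows M_j' >= M_j
    """
    w = lambda m, e: np.exp(-2j * np.pi * e / m)           # omega_m^e
    p = np.arange(Mprime)[:, None]
    V = w(2**j, sigma * p * n[None, :])                    # Vandermonde factor as in (8)/(9)
    A = V * w(2**(j + 1), n)[None, :]                      # A^(j) = V * diag(omega_{2^{j+1}}^{n_r})
    # right-hand side of (7): hat{x}^{(j+1)}_{2 h_p + 1} with h_p = sigma * p
    rhs = xhat[(2**(J - j - 1) * (2 * sigma * np.arange(Mprime) + 1)) % 2**J]
    z = np.linalg.lstsq(A, rhs, rcond=None)[0]             # z = 2 * x0_tilde - x_tilde^(j)
    x0 = np.zeros(2**j, dtype=complex)
    x0[n] = (z + x_j[n]) / 2                               # first half of x^(j+1)
    return np.concatenate([x0, x_j - x0])                  # second half follows from (4)
```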

Remark 2.2

Using Theorem 2.1, the reconstruction algorithm is based on the idea of iteratively computing periodizations \({\mathbf{x}}^{(j)} \in {\mathbb C}^{2^{j}}\) of \({\mathbf{x}} \in {\mathbb C}^{2^{J}}\) of growing length \(2^{j}\). At each iteration step, we rigorously exploit the sparsity of these vectors \({\mathbf{x}}^{(j)}\) and conclude from the support \(\{n_{1}, \ldots , n_{M_{j}} \}\) of \({\mathbf{x}}^{(j)}\) that the support of \({\mathbf{x}}^{(j+1)}\) can only be a subset of \(\{n_{1}, \ldots , n_{M_{j}} \} \cup \{n_{1}+2^{j}, \ldots , n_{M_{j}}+2^{j} \} \). Therefore, assumption (2) is crucial, since otherwise not all support indices may be found.

If (2) is not satisfied and the sparsity M of \({\mathbf{x}}\) is known beforehand, then the iteration would start by computing the periodization \({\mathbf{x}}^{(j_{0})}\) of length \(2^{j_{0}} > M\) directly, and we can compare the sparsity of \({\mathbf{x}}^{(j_{0})}\) with M to ensure that no cancellation appears. If the sparsity of \({\mathbf{x}}^{(j_{0})}\) is smaller than M, we could then employ a direct FFT algorithm to find the next periodizations \({\mathbf{x}}^{(j)}\), \(j >j_{0}\), until the sparsity of \({\mathbf{x}}^{(j)}\) is equal to M. The complexity of the algorithm then increases and depends on the level where the last cancellation appears. In the worst case, if cancellation still appears in \({\mathbf{x}}^{(J-1)}\), we would end up with the complexity of a usual FFT algorithm.

3 Extension of the Sparse FFT Algorithm

The main contribution of this paper is an extension of the algorithm proposed in [16], which tremendously improves the stability of that algorithm and makes it applicable in practice.

We keep the iterative approach of computing \({\mathbf{x}}^{(j+1)} \in {\mathbb C}^{2^{j+1}}\) from the \(M_{j}\)-sparse vector \({\mathbf{x}}^{(j)} \in {\mathbb C}^{2^{j}}\) via (7) and (4), where we consider only matrices \({\mathbf{A}}^{(j)}\) that are given as a product of a Vandermonde matrix and a diagonal matrix (with condition number 1) as in (8), and we will again try to find a suitable parameter \(\sigma _j \in \{1, \ldots , 2^{j}-1 \}\) to improve the numerical stability of the system. The Vandermonde structure provides the advantage that the system in (7) can be solved with a computational cost of \({\mathcal O}(M^2)\) (see, e.g., [7]).

However, we do not insist on a square matrix as in [16], but allow the Vandermonde factor to be a rectangular matrix with more rows than columns, of the form

$$\begin{aligned} {\mathbf{V}}_{M_j',M_j}(\sigma _j): = \Big ( \omega _{2^j}^{\sigma _j \, p \, n_r} \Big )_{p=0,r=1}^{M_j'-1, M_j}, \qquad M_j' \ge M_j. \end{aligned}$$
(9)

We will choose the number of rows of the Vandermonde matrix \({\mathbf{V}}_{M_j',M_j}(\sigma _j)\) adaptively at each iteration step based on the obtained estimate of the condition number of \({\mathbf{V}}_{M_j',M_j}(\sigma _j)\), where

$$\begin{aligned} \kappa _{2}({\mathbf{V}}_{M',M}(\sigma )) := \frac{ \max _{{\mathbf{u}}\in \mathbb {C}^{M}, \Vert \mathbf {u}\Vert _{2}=1} \Vert \mathbf {V}_{M',M}(\sigma ) \, \mathbf {u} \Vert _{2}}{\min _{{\mathbf{u} }\in {\mathbb C}^{M}, \Vert \mathbf {u} \Vert _{2}=1} \Vert \mathbf {V}_{M',M}(\sigma )\, {\mathbf{u}}\Vert _{2}}. \end{aligned}$$
(10)

We start by presenting the general pseudocode for the case of unknown sparsity M. In the subsequent subsections, we show in detail how the matrix \({\mathbf{A}}^{(j)}\), which we now allow to be rectangular, needs to be chosen. In Algorithm 3.1, we use the set notation \(I^{(j)} + 2^{j}:= \{n+2^{j}: \, n \in I^{(j)} \}\).

Algorithm 3.1

Sparse (inverse) FFT for unknown sparsity M

[Pseudocode of Algorithm 3.1 is given as a figure in the original; it contains the linear system (11) referred to below.]
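Since the pseudocode itself is only available as a figure, we add a hedged Python skeleton of our reading of Algorithm 3.1. The helper names `refine` (the update step sketched in Sect. 2), `choose_sigma` (Sect. 3.2) and `choose_Mprime` (Sect. 3.3) are our own and stand in for the corresponding steps of the published pseudocode; the fallback to a plain FFT step for \(M_j^2 \ge 2^j\) reflects the criterion discussed after (7).

```python
import numpy as np

def sparse_ifft(xhat, J, eps=1e-10, c_max=2):
    """Hedged skeleton of Algorithm 3.1: recover an M-sparse x from xhat = F_N x, N = 2**J."""
    x_j = np.array([xhat[0]], dtype=complex)     # x^(0) = hat{x}_0 = sum of all components of x
    sigma, Mprime, M_prev = 1, 1, -1
    for j in range(J):                           # compute x^(j+1) from x^(j)
        I = np.flatnonzero(np.abs(x_j) > eps)    # support set I^(j), sparsity M_j = |I^(j)|
        M_j = len(I)
        if M_j == 0:                             # x is numerically zero
            x_j = np.zeros(2**(j + 1), dtype=complex)
            continue
        if M_j**2 >= 2**j:                       # not sparse enough: a plain FFT step is cheaper
            x_j = np.fft.ifft(xhat[::2**(J - j - 1)])
            M_prev = -1
            continue
        if M_j == M_prev:                        # Sect. 3.4: double sigma, keep M'
            sigma *= 2
        else:                                    # hypothetical helpers, sketched in Sects. 3.2/3.3
            sigma = choose_sigma(I, j)
            Mprime = choose_Mprime(I, j, sigma, c_max)
        x_j = refine(x_j, I, xhat, J, j, sigma, Mprime)   # solve (11)/(7) and apply (4)
        M_prev = M_j
    return x_j
```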

To determine the suitable matrix

$$\begin{aligned} {\mathbf{A}}^{(j)} ={\mathbf{V}}_{M_j',M_j}(\sigma _j) \, \text {diag} \left( \omega _{2^{j+1}}^{{n_1}}, \ldots , \omega _{2^{j+1}}^{n_{M_j}} \right) , \end{aligned}$$

we have to find a well-conditioned Vandermonde matrix \({\mathbf{V}}_{M_j',M_j}(\sigma _j)\). Our procedure consists of two steps.

  1) We compute a suitable parameter \(\sigma _j\) with \({\mathcal O}(M^2)\) operations.

  2) We compute the number \(M_j'\) of rows needed in the Vandermonde matrix to achieve a well-conditioned coefficient matrix in the system (11).

As already observed in [16], we can simplify the procedure of determining \({\mathbf{V}}_{M_j',M_j}(\sigma _j)\) if the number of significant entries of \({\mathbf{x}}^{(j)}\) did not change in the previous iteration step, i.e., if \(M_{j-1}=M_j\). In this case, we can simply choose \(\sigma _{j} := 2 \sigma _{j-1}\) and keep the number of rows, i.e., \(M_{j}' := M_{j-1}'\) (see also Sect. 3.4).

3.1 Estimation of the Condition Number of \({\mathbf{V}}_{M_j',M_j}(\sigma _j)\)

It is crucial for our algorithm to have a good estimate of the condition number of \({\mathbf{V}}_{M_j',M_j}(\sigma _j)\). The condition number of \({\mathbf{V}}_{M_j',M_j}(\sigma _j)\) strongly depends on the minimal distance between its generating nodes \(\omega _{2^j}^{\sigma _j n_r}\). More precisely, we have the following theorem (see [12, 16] or Theorem 10.23 in [15]).

Theorem 3.2

Let \(0\le n_{1}<n_{2}<\ldots<n_{M_j}<2^{j}\) be a given set of indices. For a given \(\sigma _j \in \{1,\ldots ,2^{j}-1\}\) we define

$$\begin{aligned} d_{j} = d({\sigma _j}) := \min _{1 \le k<l \le M_j} \left( (\pm \sigma _j \, (n_{l}-n_{k})) \, \mathrm {mod}\, 2^{j} \right) \end{aligned}$$
(12)

as the smallest (periodic) distance between two indices \(\sigma _j \,{n_l}\) and \(\sigma _j \, {n_k}\), and assume that \(d_{j}>0\). Then the condition number \(\kappa _{2}(\mathbf {V}_{M_j',M_j}(\sigma _j))\) of the Vandermonde matrix \(\mathbf {V}_{M_j',M_j}(\sigma _j):= \Big ( \omega _{2^{j}}^{\sigma _j \, p \, n_r} \Big )_{p=0,r=1}^{M_j'-1, M_j}\) satisfies

$$\begin{aligned} \kappa _{2}(\mathbf {V}_{M_j',M_j}(\sigma _j))^2 \le \frac{M_j'+2^{j}/d_{j}}{M_j'-2^{j}/d_{j}}, \end{aligned}$$
(13)

provided that \(M_j'>\frac{2^{j}}{d_{j}}\).

However, this estimate cannot be used for square matrices, i.e., for \(M_{j} = M_{j}'\), and it is not very sharp for large \(M_j\). Indeed, if \(d_{j}=2^{j}/M_j\), which means that the values \(\sigma _j \, n_k\) are equidistantly distributed on the periodic interval \([0, 2^{j})\), then the square matrix \(M_j^{-1/2} \, \mathbf {V}_{M_j,M_j}(\sigma _j)\) (with \(M_j'=M_j\)) is unitary with condition number 1 (see [2]), while the estimate (13) cannot be applied. On the other hand, if \(M_j'=2^{j}\), then we can simply conclude that \(\mathbf {V}_{2^{j},M_j}(\sigma _j)^* \mathbf {V}_{2^{j},M_j}(\sigma _j) = 2^{j}\, {\mathbf{I}}_{M_j}\), such that we again obtain condition number 1, while (13) only provides the bound \(\frac{1+1/d_{j}}{1-1/d_{j}}\), which fails completely in the worst case \(d_{j}=1\). Therefore, we apply another estimate, which is a simple consequence of Gershgorin's theorem and can be evaluated at each iteration step. It is based on the following theorem.

Theorem 3.3

Let \(0\le n_{1}<n_{2}<\ldots<n_{M_j}<2^{j}\) be a given set of indices, and assume that \(\sigma _{j}(n_{k}-n_{\ell }) \ne 0 \, \mathrm {mod} \, 2^{j}\) for all \(k \ne \ell\). Further, let \(M_{j} \le M_{j}' \le 2^{j}\), and define for all \(k=1, \ldots , M_j\)

$$\begin{aligned} S_k(\sigma _j) := \sum _{\genfrac{}{}{0.0pt}{}{\ell =1}{\ell \ne k}}^{M_j} \left| \frac{\sin \Big ( \frac{M_j' \pi }{2^{j}} \, \sigma _j \, (n_k- n_{\ell })\Big )}{\sin \Big (\frac{\pi }{2^{j}} \, \sigma _j \, (n_k- n_{\ell })\Big )} \right| . \end{aligned}$$
(14)

Then the condition number of the Vandermonde matrix \(\mathbf {V}_{M_j',M_j}(\sigma _j)\) in (9) is bounded by

$$\begin{aligned} \kappa _{2}(\mathbf {V}_{M_j',M_j}(\sigma _j))^2 \le \frac{M_j'+ \max _{k} S_k(\sigma _j)}{M_j'-\max _{k} S_k(\sigma _j)}. \end{aligned}$$
(15)

Proof

Considering the matrix product \({\mathbf{W}}:= \mathbf {V}_{M_j',M_j}(\sigma _j)^* \, \mathbf {V}_{M_j',M_j}(\sigma _j) \in {\mathbb C}^{M_j \times M_j}\), it follows for the components \(w_{k,\ell }\) of \({\mathbf{W}}\) that

$$\begin{aligned} w_{k,k} = \sum _{p=0}^{M_j'-1} \omega _{2^{j}}^{p \, \sigma _j \, (n_k - n_k)} = M_j', \qquad k=1, \ldots , M_{j}, \end{aligned}$$

and for \(k \ne \ell \) and \(\sigma _{j}(n_{k}-n_{\ell }) \ne 0 \, \mathrm {mod} \, 2^{j}\),

$$\begin{aligned} |w_{k,\ell }| = \Big |\sum _{p=0}^{M_j'-1} \omega _{2^{j}}^{p \, \sigma _j \, (n_k - n_{\ell })}\Big | = \Big | \frac{1-\omega _{2^{j}}^{M_j'\sigma _j (n_k - n_{\ell })}}{1- \omega _{2^{j}}^{\sigma _j (n_k - n_{\ell })}} \Big | = \Big | \frac{\sin \left( \frac{M_j' \pi }{2^{j}} \, \sigma _j (n_k- n_{\ell })\right) }{\sin \left( \frac{\pi }{2^{j}} \, \sigma _j (n_k- n_{\ell })\right) } \Big |. \end{aligned}$$

Thus, \(S_k(\sigma _j)\) is the sum of the absolute values of all off-diagonal entries in the k-th row of \({\mathbf{W}}\). Gershgorin's theorem now implies that the maximal eigenvalue of \({\mathbf{W}}\) is bounded from above by \(M_j' + \max _{k} S_k(\sigma _j)\), and the smallest eigenvalue is bounded from below by \(M_j' - \max _{k} S_k(\sigma _j)\). Since \(\kappa _{2}(\mathbf {V}_{M_j',M_j}(\sigma _j))^2\) is the ratio of the largest and the smallest eigenvalue of \({\mathbf{W}}\), the estimate (15) follows. \(\square \)

While the estimate (15) is simple to derive, it is more accurate than (13). In particular, in the two special cases \(M_j'=M_j\), \(d_{j} = 2^{j}/M_j\) and \(M_j'=2^{j}\), \(d_{j}=1\), the estimate is sharp, and we obtain the true condition number 1.

For our computation of \(\sigma _{j}\) in Sect. 3.2, we will however simplify (14) and consider instead the upper bound

$$\begin{aligned} \tilde{S}_k(\sigma _j) := \sum _{\genfrac{}{}{0.0pt}{}{\ell =1}{\ell \ne k}}^{M_j} \left| \frac{1}{\sin (\frac{\pi }{2^{j}} \, \sigma _j \, (n_k^{(j)}- n_{\ell }^{(j)}))} \right| \ge S_k(\sigma _j) \end{aligned}$$
(16)

which no longer depends on \(M_j'\). Note that \(\tilde{S}_k(\sigma _j)>2^{j}\) can occur if \(\sigma _{j}\) is not well chosen.
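For completeness, the bounds (15) and (16) can be evaluated directly; the following is our own numpy illustration of Theorem 3.3, not code from the paper's implementation.

```python
import numpy as np

def gershgorin_bound(n, j, sigma, Mprime):
    """Upper bound (15) for kappa_2(V_{M',M}(sigma))**2 via the row sums S_k from (14)."""
    n = np.asarray(n)
    d = sigma * (n[:, None] - n[None, :])                     # sigma * (n_k - n_l)
    off = ~np.eye(len(n), dtype=bool)                         # off-diagonal mask (k != l)
    s = np.abs(np.sin(Mprime * np.pi * d[off] / 2**j)
               / np.sin(np.pi * d[off] / 2**j)).reshape(len(n), len(n) - 1)
    S_max = s.sum(axis=1).max()                               # max_k S_k(sigma)
    return np.inf if S_max >= Mprime else (Mprime + S_max) / (Mprime - S_max)

def tilde_S(n, j, sigma):
    """Simplified bounds tilde{S}_k(sigma) from (16), independent of M'."""
    n = np.asarray(n)
    d = sigma * (n[:, None] - n[None, :])
    off = ~np.eye(len(n), dtype=bool)
    return (1.0 / np.abs(np.sin(np.pi * d[off] / 2**j))).reshape(len(n), len(n) - 1).sum(axis=1)
```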

3.2 Efficient Computation of \(\sigma _j\)

For a given set of indices \(0 \le n_1< n_2< \ldots< n_M < 2^j\) we want to find a suitable \(\sigma _j \in \{1, \ldots , 2^{j}-1 \}\) such that an approximation of \(\max _k \tilde{S}_k(\sigma _j)\) is minimal. More precisely, as shown in Algorithm 3.4, we compare different candidate parameters \(\sigma \) by comparing sums of four terms of the sum in (16), where the largest term is always included.

Of course, we could simply consider all sets \(\{ \sigma n_1, \ldots , \sigma n_M\}\) for \(\sigma \in \{1, \ldots , 2^{j}-1\}\), compute \(\max _k \tilde{S}_k(\sigma )\), and compare the results to find the optimal parameter \(\sigma _j\). However, this procedure is too expensive. To achieve a sparse FFT algorithm with the desired overall complexity of \({\mathcal O}(M^2 \log N)\), we can spend at most \({\mathcal O}(M^2)\) operations to find a suitable parameter \(\sigma _j\).

To avoid vanishing distances \(\pm \sigma _j(n_k - n_\ell ) \, \mathrm {mod} \, 2^{j}=0\) for \(n_k \ne n_\ell \), we only consider odd integers \(\sigma _j \ge 1\). Then \(2^{j}\) and \(\sigma _j\) are coprime, such that for each odd \(\sigma _j\) we at least achieve that \(\max _k \tilde{S}_k(\sigma _j)\) is bounded. Since our numerical tests show that prime numbers are good candidates for \(\sigma _j\), we propose the following algorithm to determine \(\sigma _j\).

Algorithm 3.4

(Computation of \(\sigma _j\) if \(M_j > M_{j-1}\))

[Pseudocode of Algorithm 3.4 is given as a figure in the original.]

The most expensive step in Algorithm 3.4 is the sorting of the \(M_{j}\) elements in \(\sigma I^{(j)}\), which can be done with \({\mathcal O}(M_{j} \log M_{j}) \le {\mathcal O}(M \log M)\) operations. Since \(\Sigma \) contains \(K < M_{j}/\log _{2} M_{j}\) elements, the algorithm has a computational cost of \({\mathcal O}(M^2)\). Note that we do not compute the complete sum \(\tilde{S}_k(\sigma )\) for every choice of \(\sigma \) in Algorithm 3.4. Instead, for fixed \(\sigma \), we search for an index \(\tilde{k}\) that provides the smallest (periodic) distance \(|\sigma (n_{\tilde{k}} - n_{\tilde{k}-1})| = \min _{k \ne \ell }|\sigma (n_{k} - n_{\ell })|\). This index \(\tilde{k}\) is a good candidate for \(\mathop {\mathrm {argmax}}_k \tilde{S}_k(\sigma )\). We then compute only the sum of the largest term of \(\tilde{S}_{\tilde{k}}(\sigma )\) and its neighboring terms instead of the full sum, since \(\tilde{S}_{\tilde{k}}(\sigma )\) is mainly governed by these terms.
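Since the pseudocode of Algorithm 3.4 is only available as a figure, the following is a hedged reconstruction from the description above (our own code; the construction of the candidate set, the choice of K and the four-term approximation are our reading of the procedure).

```python
import numpy as np

def odd_primes(count):
    """First `count` odd primes 3, 5, 7, ... via simple trial division."""
    primes, q = [], 3
    while len(primes) < count:
        if all(q % p for p in range(3, int(q**0.5) + 1, 2)):
            primes.append(q)
        q += 2
    return primes

def choose_sigma(n, j, K=None):
    """Pick an odd prime sigma from a small candidate set Sigma such that an
    approximation of max_k tilde{S}_k(sigma) in (16) is small."""
    n, M = np.asarray(n), len(n)
    if M < 2:
        return 1
    K = K or max(1, M // max(1, int(np.log2(M))))                   # |Sigma| < M_j / log2(M_j)
    Sigma = [p for p in odd_primes(K) if p < 2**j] or [1]
    best_sigma, best_score = 1, np.inf
    for sigma in Sigma:
        nodes = np.sort((sigma * n) % 2**j)
        gaps = np.diff(np.concatenate([nodes, nodes[:1] + 2**j]))   # periodic gaps
        i = int(np.argmin(gaps))                                    # node with the closest neighbour
        neigh = sorted({(i + s) % M for s in (-2, -1, 1, 2)} - {i}) # up to four nearest neighbours
        d = np.abs(nodes[neigh] - nodes[i])
        d = np.minimum(d, 2**j - d)                                 # periodic distances
        score = np.sum(1.0 / np.sin(np.pi * d / 2**j))              # partial sum of tilde{S}_i(sigma)
        if score < best_score:
            best_sigma, best_score = sigma, score
    return best_sigma
```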

Remark 3.5

Using Theorem 3.2, it is of course also possible to determine \(\sigma _j\) by comparing only the minimal distances \(d({\sigma })\) in (12) for all \(\sigma \in \Sigma \), and to choose the \(\sigma \in \Sigma \) that maximizes this distance.

There are always enough odd prime numbers available in \([1, \, \frac{2^{j}}{2}]\), since \(M_j^2 < 2^{j}\) (see, e.g., [18]).

3.3 Determination of \(M_j'\)

Further, we need to fix the number \(M_j' \ge M_j\) of rows to ensure that the Vandermonde matrix \(\mathbf {V}_{M_j',M_j}(\sigma _j)\) is well conditioned. Employing Theorem 3.3, we consider \(M_{j}' = c \, M_{j}\) for a small set of integers c, e.g. \(c \in \{1, 2, 5\}\). Starting with \(c=1\), we compute \(\max _{k} S_{k}(\sigma _{j})\) in (14) with \({\mathcal O}(M_{j}^{2})\) operations and check via (15) whether the condition number of \({\mathbf{V}}_{M_{j}',M_{j}}(\sigma _{j})\) is acceptable. If it is too large, we enlarge c.

Remark 3.6

We can also use the estimates in Theorem 3.2 for determining \(M_j'\). In this case, we simply fix \(M_j'\) such that

$$\begin{aligned} \left( \frac{M_j' + 2^{j}/d_{j}}{M_j'- 2^{j}/d_{j}} \right) ^{1/2} < C \end{aligned}$$

where C is a pre-determined bound for the condition number of \({\mathbf{V}}_{M_{j}',M_{j}}(\sigma _{j})\). However, this estimate usually leads to a strong overestimation of \(M_j'\).

In our numerical experiments we achieved good results with the simple bound

$$\begin{aligned} M_j' = c\, M_j \qquad \text {with} \qquad c :=\min \left\{ \left\lfloor \frac{2^j/M_j}{d_{j}}\right\rfloor , c_{\max }\right\} , \end{aligned}$$
(17)

where \(c_{\max }\) is usually an integer with \(c_{\max } \le 5\) (see Sect. 5). This setting can also be understood as a compromise between obtaining a good condition number of the matrix \({\mathbf{A}}^{(j)}\) in the system (11) on the one hand and the computational cost for solving the linear system on the other hand. Using, for example, the QR decomposition algorithm in [7] for rectangular Vandermonde matrices of size \(c M_j \times M_j\), we obtain a complexity of \((5c+ \frac{7}{2})M_j^2 + {\mathcal O}(c M_j)\).
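A minimal realization of the rule (17) could look as follows (our own sketch; the name `choose_Mprime` matches the skeleton of Algorithm 3.1 above). Alternatively, one could start with \(c=1\) and enlarge c until the bound returned by `gershgorin_bound` from Sect. 3.1 falls below a prescribed constant.

```python
import numpy as np

def choose_Mprime(n, j, sigma, c_max=5):
    """Choose M_j' = c * M_j according to (17), with the minimal periodic distance d_j from (12)."""
    n, M = np.asarray(n), len(n)
    nodes = np.sort((sigma * n) % 2**j)
    gaps = np.diff(np.concatenate([nodes, nodes[:1] + 2**j]))   # periodic gaps of the stretched nodes
    d_j = int(gaps.min())                                       # d(sigma_j) as in (12)
    c = min(int((2**j / M) // max(d_j, 1)), c_max)
    return max(c, 1) * M
```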

3.4 Choice of \({\mathbf{A}}^{(j)}\) if \(M_{j-1}=M_j\)

If \(M_j=M_{j-1}\), we apply the following lemma, which is an extension of Theorem 4.2 in [16].

Lemma 3.7

Let \(\sigma _{j-1}\) and \(M_{j-1}'\) be the parameters used in Algorithm 3.1 to determine \({\mathbf{V}}_{M_{j-1}', M_{j-1}}(\sigma _{j-1})\) at iteration step \(j-1\), where \(0\le n_1^{(j-1)}<\ldots<n_{M_{j-1}}^{(j-1)}<2^{j-1}\) are the support indices of \(\mathbf{x} ^{(j-1)}\). Further, assume that we have found \({\mathbf{x}}^{(j)}\) with \(M_j = M_{j-1}\) and support indices \(0\le n_1^{(j)}<\ldots<n_{M_{j}}^{(j)}<2^{j}\). Then we can simply choose \(\sigma _j:=2 \sigma _{j-1}\) and \(M_j' := M_{j-1}'\) to obtain a Vandermonde matrix \({\mathbf{V}}_{M_{j}', M_{j}}(\sigma _{j})\) for iteration step j of Algorithm 3.1. With this choice, \({\mathbf{V}}_{M_{j}', M_{j}}(\sigma _{j})\) coincides with \({\mathbf{V}}_{M_{j-1}', M_{j-1}}(\sigma _{j-1})\) up to a possible permutation of columns. In particular, we have

$$\begin{aligned} \kappa _2({\mathbf{V}}_{M_{j}', M_{j}}(\sigma _j)) = \kappa _2({\mathbf{V}}_{M_{j-1}', M_{j-1}}(\sigma _{j-1})). \end{aligned}$$

Proof

If \(M_j=M_{j-1}\), then it follows that \(n_r^{(j)} \in \{n_r^{(j-1)}, n_r^{(j-1)}+2^{j-1}\}\) for all \(r=1, \ldots , M_{j-1}\). With \(\sigma _j=2\sigma _{j-1}\) we obtain

$$\begin{aligned} \sigma _j \, n_r^{(j)} \, \mathrm {mod} \, 2^j = 2\sigma _{j-1}n_r^{(j)} \, \mathrm {mod} \, 2^j=2\sigma _{j-1}n_r^{(j-1)} \, \mathrm {mod} \, 2^j. \end{aligned}$$

Thus, for \(p=1, \ldots , M_j'\) (with \(M_j'=M_{j-1}'\)),

$$\begin{aligned} \omega _{2^j}^{\sigma _j (p-1) n_r^{(j)}} = \omega _{2^j}^{2 \sigma _{j-1} (p-1) n_r^{(j)}} = \omega _{2^j}^{2 \sigma _{j-1} (p-1) n_r^{(j-1)}} = \omega _{2^{j-1}}^{\sigma _{j-1} (p-1) n_r^{(j-1)}}. \end{aligned}$$

Hence, \({\mathbf{V}}_{M_{j-1}', M_{j-1}}(\sigma _{j-1})\) and \({\mathbf{V}}_{M_{j}', M_{j}}(\sigma _{j})\) have the same columns, and may differ only due to a different ordering of columns. In other words, there is an \(M_j \times M_j\) permutation matrix \({\mathbf{P}}_{M_j}\), such that \({\mathbf{V}}_{M_{j}', M_{j}}(\sigma _{j}) = {\mathbf{V}}_{M_{j-1}', M_{j-1}}(\sigma _{j-1}) {\mathbf{P}}_{M_j}\). In particular, the two matrices have the same condition number. \(\square \)

This observation implies that no extra effort is needed to compute the matrix \({\mathbf{A}}^{(j)}\) at iteration steps j where the sparsity \(M_j\) has not changed compared to \(M_{j-1}\).
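As a quick numerical sanity check of Lemma 3.7 (our own illustration with an arbitrarily chosen support), one can verify that doubling \(\sigma\) reproduces the same Vandermonde matrix:

```python
import numpy as np

j, sigma_prev, Mprime = 6, 5, 8
n_prev = np.array([1, 7, 12, 20])                           # support at level j-1, here 2**(j-1) = 32
n_curr = n_prev + np.array([0, 2**(j - 1), 0, 2**(j - 1)])  # lifted support at level j

w = lambda m, e: np.exp(-2j * np.pi * e / m)
p = np.arange(Mprime)[:, None]
V_prev = w(2**(j - 1), sigma_prev * p * n_prev[None, :])
V_curr = w(2**j, 2 * sigma_prev * p * n_curr[None, :])      # sigma_j = 2 * sigma_{j-1}

# identical columns (kept in corresponding order here; in general they may be permuted),
# hence identical condition numbers
assert np.allclose(V_prev, V_curr)
print(np.linalg.cond(V_prev), np.linalg.cond(V_curr))
```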

4 The Direct Sparse FFT Algorithm

We consider now the direct sparse FFT problem stated in (b) in Sect. 1. For given \({\mathbf{x}} \in {\mathbb C}^N\), we want to determine \({\mathbf{y}} := \hat{\mathbf{x}} = {\mathbf{F}}_N \, {\mathbf{x}}\), assuming that \( {\mathbf{y}}\) possesses unknown sparsity M. We will show that our Algorithm 3.1 can be transferred to this problem.

First, we observe that the Fourier matrix satisfies the property

$$\begin{aligned} {\mathbf{F}}_N^{-1} = \frac{1}{N} \overline{\mathbf{F}}_N = \frac{1}{N} {\mathbf{J}}_N' \, {\mathbf{F}}_N \end{aligned}$$

(see Equation (3.34) in [15]), where \({\mathbf{J}}_N' := (\delta _{(j+k) \, \mathrm {mod}\, N})_{j,k=0}^{N-1}\) is the so-called flip matrix with \(({\mathbf{J}}_N')^{-1} = {\mathbf{J}}_N'\). Here, \(\delta _j\) denotes the Kronecker symbol, i.e., \(\delta _j =0\) for \(j\ne 0\) and \(\delta _j=1\) for \(j=0\). Thus, the relation \({\mathbf{x}} = {\mathbf{F}}_N^{-1} \, {\mathbf{y}}\) is equivalent to

$$\begin{aligned} {\mathbf{w}} := N \, {\mathbf{J}}_N' {\mathbf{x}} = {\mathbf{F}}_N {\mathbf{y}}. \end{aligned}$$

In other words, if we use the vector \({\mathbf{w}}\) instead of \(\hat{\mathbf{x}}\) as input of Algorithm 3.1, then \({\mathbf{w}}\) is the given Fourier transform of the sought vector \({\mathbf{y}}\), and Algorithm 3.1 directly computes \({\mathbf{y}}\).
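Building the input \({\mathbf{w}}\) from \({\mathbf{x}}\) is a single indexing operation; a small sketch (our own, reusing the hedged `sparse_ifft` skeleton from Sect. 3 as a stand-in for Algorithm 3.1):

```python
import numpy as np

def sparse_fft(x, J):
    """Compute y = F_N x for an (expected) M-sparse y by feeding w = N * J_N' x to Algorithm 3.1."""
    N = 2**J
    w = N * x[(-np.arange(N)) % N]      # (J_N' x)_j = x_{(N - j) mod N}
    return sparse_ifft(w, J)            # w = F_N y, so the sparse inverse FFT returns y
```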

5 Numerical Experiments

First, we present some numerical experiments showing that the algorithm in [16] is no longer reliable for sparsity \(M>20\). We generate randomly chosen sets of support indices \(I_M \subset \{0, \ldots , 2^{15}- 1\}\) with different cardinalities \(M=20,30,\ldots ,100\) and randomly choose values \(x_k\) for \(k \in I_M\) in double precision arithmetic. Then we apply our Algorithm 3.1, where access to the Fourier transform of \(\mathbf{x} \in \mathbb {C}^{2^J}\), \(J=15\), is provided. While \(\sigma _j\) is optimally chosen as a prime number according to Algorithm 4.5 in [16], we only consider square Vandermonde matrices (as in [16]), i.e., we set \(c_{\max }=1\). We compare the output index set \(I_{out}\) with the generated set \(I_M\) of indices and count the failures over 100 tests for each M. The results are presented in Fig. 1. The test shows that the algorithm starts to be unreliable for sparsity \(M >20\).
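For reference, the test data can be generated along the following lines (our own sketch of the described setup; the exact random model used for the reported experiments may differ in detail).

```python
import numpy as np

rng = np.random.default_rng()
J, M = 15, 30
I_M = rng.choice(2**J, size=M, replace=False)     # random support indices
x = np.zeros(2**J, dtype=complex)
x[I_M] = rng.random(M) + 1j * rng.random(M)       # values in one quadrant, cf. (2)
xhat = np.fft.fft(x)                              # input provided to Algorithm 3.1
# run the sparse FFT algorithm on xhat and compare the recovered support I_out with I_M
```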

Fig. 1  Error rate in percent for the computed set of indices for \(c_{\max }=1\) and \(J=15\)

Table 1  Average condition number for \(c_{\max }=1\) after 20 tests
Table 2  Average condition number for \(c_{\max }=2\) (left) and \(c_{\max }=5\) (right) after 20 tests

We now rerun the test with the same input data as above, but use the criterion (17) with \(c_{\max }=2\). For all \(M=20,30,\ldots ,100\), no failures occur for the computed set of indices \(I_{out}\), i.e., we always find \(I_M= I_{out}\). Even if we run the tests for \(M=200\), the error rate is still zero.

To understand this strong effect of enlarging the number of rows of the Vandermonde matrix, we analyze the condition numbers of the Vandermonde matrices occurring in the computations for different values of \(c_{\max }\). We generate sets \(I_M\) of indices and randomly choose the amplitudes of the components of \({\mathbf{x}}\) with support \(I_M\). As before, we provide access to the Fourier transformed vector \(\hat{\mathbf{x}}\) as input to Algorithm 3.1, for the tuples (J, M) with \(J=15,16,\ldots ,22\) and \(M=20,30,40,50\). In this experiment, we vary \(c_{\max } \in \{1,2,5\}\). In each test we compute the average over all condition numbers of the used Vandermonde matrices and repeat this 20 times for each tuple (J, M). Finally, we take the mean of the 20 averages and obtain the results given in Tables 1 and 2. The results in Table 1 show that a suitable choice of the parameter \(\sigma _j\), as applied in [16], is not sufficient to ensure moderate condition numbers of the Vandermonde matrices involved in the sparse FFT algorithm for \(M \ge 20\).

In Table 2, we provide some further condition numbers for larger numbers M of significant vector entries up to \(M=200\) and \(N=2^{15},\dots ,2^{22} \). The experiments show that \(c_{\max } =2\), i.e., doubling the number of rows in the matrix \({\mathbf{A}}^{(j)}\), is usually sufficient for \(M\le 100\). For \(M>100\), we need to take a larger \(c_{\max }\).

Now we investigate how the runtime of Algorithm 3.1 depends on \(c_{\max }\). In Fig. 2 we present the average runtime of 20 tests with randomly chosen sparse vectors with sparsities \(M=10, 30\) and for \(c_{\max }=1\), \(c_{\max }=5\), and \(c_{\max }=20\). As Fig. 2 shows, our modifications have only a very small effect on the runtime. Finally, in Fig. 3 we compare the runtime of the Python FFT implementation numpy.fft.fft of length \(2^J\) with our algorithm for \(c_{\max }=20\). We can see that our current Python implementation starts to be faster than the FFT for \( M \le 30\) and \(N \ge 2^{20}\). It is available under the link “software” on our homepage http://na.math.uni-goettingen.de.

Fig. 2  Runtime comparison of Algorithm 3.1 for \(c_{\max }=1\) (green), \(c_{\max }=5\) (blue), and \(c_{\max }=20\) (red), for \(M=10\) (solid line) and \(M=30\) (dashed line), and length \(N=2^J\) with \(J=10,\ldots ,24\). (Color figure online)

Fig. 3  Runtime comparison of Algorithm 3.1 with \(c_{\max }=20\) (red) and the FFT (gray) for \(M=10\) (solid line) and \(M=30\) (dashed line), and length \(N=2^J\) with \(J=10,\ldots ,24\). (Color figure online)

6 Conclusions

In this paper, we have presented a modification of the sparse FFT algorithm in [16], which is based on the assumption that the sought vector \({\mathbf{x}} \in {\mathbb C}^{N}\) with \(N= 2^{J}\) is M-sparse and that the components of its discrete Fourier transform \(\hat{\mathbf{x}}= {\mathbf{F}}_{N}\, {\mathbf{x}}\) are available. Our proposed algorithm has complexity \({\mathcal O}(M^{2} \log N)\) and is sublinear in N for small M. As in [16], the reconstruction of \({\mathbf{x}}\) is based on an iterative reconstruction of the \(2^{j}\)-periodizations of \({\mathbf{x}}\) for \(j=0, \ldots , J\). At each iteration step, one needs to solve a linear system of size \({\mathcal O}(M)\), whose coefficient matrix is governed by a Vandermonde matrix that is a submatrix of the Fourier matrix \({\mathbf{F}}_{2^{j}}\). In contrast to [16], we have considered rectangular Vandermonde matrices, and we have presented efficient methods to determine these matrices depending on two parameters, both of which have a huge impact on the condition number. The first parameter \(\sigma _{j}\) changes the nodes \(\omega _{2^{j}}^{n_{\ell }}\), \(\ell =1, \ldots , M_{j}\), generating the Vandermonde matrix to \(\omega _{2^{j}}^{\sigma _{j} n_{\ell }}\), where \(M_{j}\le M\) denotes the detected sparsity of \({\mathbf{x}}^{(j)}\). The second parameter \(M_{j}' \ge M_{j}\) is the number of rows of the Vandermonde matrix. One ingredient for determining suitable parameters \(\sigma _{j}\) and \(M_{j}'\) is the new estimate for the condition number of the occurring Vandermonde matrices in Theorem 3.3. As shown in the numerical experiments, the presented modification makes the sparse FFT algorithm applicable also for larger sparsity values M, while the original algorithm in [16] already becomes unreliable for \(M>20\).