1 Introduction

Scientists investigate nature by collecting diverse types of data. They then infer the underlying rules by modeling and analyzing the recorded data. Time series is a commonly encountered data type. Its time-evolving nature paves the way for scientists to access the system’s dynamics. Time–frequency (TF) analysis is a powerful time series analysis tool, which captures non-stationary oscillatory dynamics and serves as a portal to the underlying system.

During the past 70 years, several TF analysis methods were developed [25], which can be classified into three types: linear, quadratic, and nonlinear. Linear-type transforms, such as the short time Fourier transform (STFT) and the continuous wavelet transform (CWT), have been widely studied. They are subject to the limitation of the uncertainty principle associated with the CWT or the STFT [25, 26, 44]. Quadratic-type transforms, such as the Wigner–Ville distribution and Cohen class, could provide a more adaptive analysis of the input signal. However, they suffer from severe mode mixing artifacts [25]. There are several nonlinear-type transforms including: the reassignment method (RM) [3, 11] and its variations, the TF by convex optimization (Tycoon) [33], the Blaschke decomposition (BKD) [18, 19], the empirical mode decomposition (EMD) [30], the iterative filtering [17], the sparsification approach [29], the approximation approach [16], the TF jigsaw puzzle (TFJP) for the Gabor transform (GT) [32, 43], the non-stationary GT (NSGT) [5], the matching pursuit [37], and several others. The variations of RM include the synchrosqueezing transform (SST) [22, 53], the synchrosqueezed wave packet transform [54], the synchrosqueezed S-transform [31], the second-order SST [41], the concentration of frequency and time (ConceFT) [23], and the de-shape SST [36]. While the approaches vary from algorithm to algorithm, the common goal of nonlinear-type transforms is to obtain a “sharpened” TF representation (TFR) that could provide more accurate dynamical information underlying the recorded time series. We refer interested readers to [23] for a more extensive literature survey and their applications.

The nonlinear-type transforms can be classified into two categories. The first category consists of transforms that do not require choosing a window, like the BKD, the EMD, and the Tycoon. While the EMD has been widely applied, its application to data analysis needs more attention due to its lack of mathematical foundation. The BKD, on the other hand, is solidly supported by the complex analysis theory. However, there are still several mathematical challenges left unsolved, and the application of the BKD to data analysis is still in its infancy. The Tycoon is a synthesis-based approach to estimate the TFR with the sparsity constraint based on the convex optimization. While it theoretically has the potential to achieve a sharp TFR, it is currently compute-intensive.

The second category consists of transforms that depend on a chosen window, which can be classified into two subcategories: the reassignment-type and the non-reassignment-type. The reassignment-type subcategory includes the RM and its variations, and the non-reassignment-type subcategory includes the other algorithms. While different methods are subject to different limitations, they are all limited by the window selection problem: what is the optimal window for analyzing a given time series? Ideally, the optimal window would be universal and would always provide the optimal result under some constraints. However, it is widely believed that there is probably no such universal window, due to the complicated nonlinearity hidden inside natural signals. Different methods address this issue differently. For example, for the reassignment-type transforms, it can be proved that when the signal and window satisfy some regularity conditions, the algorithms are adaptive to the signal, in the sense that the dependence on the window is negligible; see, for example, [22, Theorem 3.3]. In practice, however, these conditions may not hold, so the performance of the algorithm is not guaranteed. Thus, how to determine the optimal window for nonlinear time series is a crucial issue.

In this paper, we aim to alleviate this window selection issue for the reassignment-type transforms. We use the Rényi entropy to determine the optimal window width. By applying the optimal window width, the TFR sharpness can be enhanced while the reconstruction routine of the SST and its variations is preserved. Specifically, we regard a window as optimal for a chosen TF analysis if the distribution of the associated TFR is highly concentrated. While there are several ways to measure the distribution concentration, we apply the Rényi entropy [6, 20, 45], which has been shown to efficiently estimate the signal information content and complexity in the TFR.

The article is organized as follows. Section 2 summarizes the background material, including the adaptive harmonic model (AHM), which describes an oscillatory signal composed of multiple components, and several reassignment-type TF analysis tools that can be applied to analyze such signals. Section 3 describes a scheme to optimize the performance of the reassignment-type TF analyses by window width selection. A comparison of the proposed scheme and some non-reassignment-type transforms is also provided. Numerical results and an application to attosecond physics are reported in Sect. 4. A conclusion is drawn in Sect. 5.

2 Background

In this section, we summarize the AHM to quantify oscillatory signals and review several recently proposed TF analysis tools suitable for analyzing signals satisfying the AHM. While the review could be extended to other reassignment-type transforms, such as the RM, the de-shape SST and the ConceFT, we only review the SST and the second-order SST in this section.

2.1 Adaptive harmonic model

The AHM aims to describe the time-varying oscillatory dynamics in a given signal. Suppose that the signal x(t) is composed of finite \(K\ge 1\) oscillatory functions; that is, \(x(t)=\sum _{k=1}^K f_k(t)\), where \(f_k\) is the k-th oscillatory function and \(k=1,\ldots ,K\). The kth oscillatory function \(f_k\) is composed of an amplitude modulation (AM) \(a_k(t)\), which is positive, and a phase function \(\phi _k(t)\), which is strictly monotonically increasing, so that \(f_k(t)=a_k(t)\cos (2\pi \phi _k(t))\), for \(k=1,\ldots ,K\). The \(\phi '_k(t)\) is thus positive and is regarded as the instantaneous frequency (IF) of the k-th oscillatory function. In this study, we consider only real oscillatory signals, since most time series we acquire in the real world are real.

While such an AHM describes a signal composed of multiple oscillatory functions, it is too general to work with and we need some constraints. Fix \(\epsilon \ge 0\). Let the positive constant c bound the variation of the IF function; that is, \(\Vert \phi _k''\Vert _{\infty }\le c\) for \(k=1,\ldots ,K\). It is also assumed that the variation of the AM is controlled by the IF; that is, \(|a_k'(t)|\le \epsilon \phi '_k(t)\) for all time \(t\in {\mathbb {R}}\) and \(k=1,\ldots ,K\). We call an oscillatory function satisfying these constraints an intrinsic mode type (IMT) function. Assume that the frequency gap between two adjacent IMT components is larger than \(d>0\) for all time \(t\in {\mathbb {R}}\); that is, \(\phi '_k(t)-\phi '_{k-1}(t)>d\) for \(k=2,\ldots ,K\). In practice, we assume that \(\epsilon <1\) is small enough so that the AM is slowly varying. A function satisfying the above conditions is said to satisfy the generalized AHM, and the constants \(\epsilon ,c,d\) are the model parameters.
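The constraints above can be checked numerically on a time grid. The following is a minimal sketch, assuming a hypothetical two-component signal (it is not the test signal used later in the paper); the helper names `imt_conditions` and `min_frequency_gap` and the values of \(\epsilon \) and c are illustrative choices only.

```python
import math


def imt_conditions(a, a_prime, phi_prime, phi_second, times, eps, c):
    """Check a > 0, phi' > 0, |a'| <= eps * phi', and |phi''| <= c on a grid."""
    return all(
        a(t) > 0
        and phi_prime(t) > 0
        and abs(a_prime(t)) <= eps * phi_prime(t)
        and abs(phi_second(t)) <= c
        for t in times
    )


def min_frequency_gap(phi_primes, times):
    """Smallest gap phi'_k(t) - phi'_{k-1}(t) over adjacent components."""
    return min(
        upper(t) - lower(t)
        for lower, upper in zip(phi_primes, phi_primes[1:])
        for t in times
    )


# Hypothetical component 1: a_1(t) = 1 + 0.1 cos(t), phi_1(t) = 2t + 0.1 sin(t).
def a1(t): return 1 + 0.1 * math.cos(t)
def da1(t): return -0.1 * math.sin(t)
def dphi1(t): return 2 + 0.1 * math.cos(t)
def ddphi1(t): return -0.1 * math.sin(t)


# Hypothetical component 2: a_2(t) = 1, phi_2(t) = 5t + 0.05 t^2 (linear chirp).
def a2(t): return 1.0
def da2(t): return 0.0
def dphi2(t): return 5 + 0.1 * t
def ddphi2(t): return 0.1


times = [i / 100 for i in range(1001)]  # 0 s to 10 s
eps, c = 0.1, 0.2                       # illustrative model parameters
ok1 = imt_conditions(a1, da1, dphi1, ddphi1, times, eps, c)
ok2 = imt_conditions(a2, da2, dphi2, ddphi2, times, eps, c)
gap = min_frequency_gap([dphi1, dphi2], times)  # any d below this value works
```

A grid check of this kind can only suggest that the conditions hold; it does not prove them for all \(t\in {\mathbb {R}}\).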

2.2 STFT

The STFT of a tempered distribution x with respect to a chosen window G in the Schwartz space is defined by

$$\begin{aligned} {{V}}_{x}^{G}(u,\eta )= \int _{-\infty }^{\infty } x(t)G(t-u)e^{-i2\pi \eta (t-u)} \, \mathrm {d}t, \end{aligned}$$
(1)

where \(u\in {\mathbb {R}}\) is the time and \(\eta \in {\mathbb {R}}^+\) is the frequency.
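As an illustration of definition (1), the integral can be approximated by a Riemann sum. The sketch below assumes a Gaussian window with unit integral; the step `dt`, the truncation `half_width`, and the function names are hypothetical choices. For a harmonic \(x(t)=e^{i2\pi \xi _0 t}\), the result should be close to \({\hat{G}}(\eta -\xi _0)e^{i2\pi \xi _0 u}\), where \({\hat{G}}(\xi )=e^{-2\pi ^2\sigma ^2\xi ^2}\) for this window.

```python
import cmath
import math


def gaussian_window(t, sigma):
    """Gaussian window with unit integral and standard deviation sigma."""
    return math.exp(-t * t / (2 * sigma * sigma)) / (math.sqrt(2 * math.pi) * sigma)


def stft(x, u, eta, sigma, dt=1 / 200, half_width=5.0):
    """Riemann-sum approximation of V_x^G(u, eta) in (1).

    x is a callable signal. The integral is truncated to
    |t - u| <= half_width * sigma, where the window is negligible.
    """
    n = int(half_width * sigma / dt)
    total = 0j
    for k in range(-n, n + 1):
        s = k * dt  # s = t - u
        total += x(u + s) * gaussian_window(s, sigma) * cmath.exp(-2j * math.pi * eta * s)
    return total * dt


xi0 = 5.0  # frequency of the test harmonic, in Hz


def harmonic(t):
    return cmath.exp(2j * math.pi * xi0 * t)


sigma = 0.2
v_on = stft(harmonic, u=1.0, eta=xi0, sigma=sigma)         # on the IF
v_off = stft(harmonic, u=1.0, eta=xi0 + 1.0, sigma=sigma)  # 1 Hz away
```

The off-frequency value is not zero: the energy of the harmonic is smeared over a band whose width is proportional to \(1/\sigma \), which is the effect the reassignment-type transforms aim to reduce.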

2.3 SST

The SST can be embedded in different linear-type transforms, such as the CWT [22], the wave packet [54], or the S-transform [31]. Here we only mention the SST embedded in STFT due to the page limit. The SST with the resolution \(\kappa >0\) and the threshold \(\gamma \ge 0\) is defined by

$$\begin{aligned} {{S}}_{x}^{G,\kappa ,\gamma }(u,\xi )=\int _{A_{x,\gamma }(u)} {{V}}_{x}^{G}(u,\eta )\frac{1}{\kappa }h\Big ( \frac{|\xi -\omega ^\gamma _x(u,\eta )|}{\kappa } \Big ) \, \mathrm {d}\eta , \end{aligned}$$
(2)

where \(u\in {\mathbb {R}}\) is the time, \(\xi >0\) is the frequency, \(A_{x,\gamma }(u):=\left\{ \eta \in {\mathbb {R}}^+:\left| {V}^{G}_x (u,\eta )\right| \ge \gamma \right\} \), \(h(t)=\frac{1}{\sqrt{\pi }} e^{-t^2}\), and \(\omega ^\gamma _x(u,\eta )\) is the reassignment rule:

$$\begin{aligned} \omega ^\gamma _x (u,\eta )=\left\{ \begin{array}{ll} \frac{-i\partial _u{{V}}_{x}^{G}(u,\eta )}{2\pi {{V}}_{x}^{G}(u,\eta )} &{}\quad \text{ when } |{V}_{x}^{G}(u,\eta )|\ge \gamma \\ -\infty &{}\quad \text{ when } |{{V}}_{x}^{G}(u,\eta )|<\gamma . \end{array} \right. \end{aligned}$$
(3)

The TFR determined by the STFT is sharpened by reassigning its coefficient at \((u,\eta )\) to a different point \((u,\xi )\) according to the reassignment rule. The SST is clearly nonlinear in nature. It is important to note that the reassignment rule primarily depends on the phase information of the STFT, which contains the IF information. According to the theoretical analysis in [41, 53], the TFR of the SST is concentrated only on the IFs of all oscillatory components when the IFs of the IMT functions in x(t) are slowly varying.

While the SST algorithm looks complicated at first glance, the idea underlying the algorithm is intuitive. Take a harmonic function \(x(t)=Ae^{i2\pi \xi _0t}\) as an example. Choose a window function G such that \({\hat{G}}\) is a real function and \({\hat{G}}(\xi )\ge \gamma \) when \(\xi \in [-{\Delta },{\Delta }]\), where \(\gamma >0\) is chosen small enough and \({\Delta }>0\). Note that x(t) is an IMT function. The STFT of x(t) can be directly calculated by the Plancherel theorem, and we have \(V^{G}_x(u,\eta )=A{\hat{G}}(\eta -\xi _0)e^{i2\pi \xi _0 u}\). The information of interest in an oscillatory signal, the IF, is hidden in the phase of \(V^{G}_x(u,\eta )\). An intuitive way to obtain the IF in this case is to first apply the logarithm function to \(V^{G}_x(u,\eta )\), next divide it by \(i2\pi \), and then differentiate with respect to u when \(|{\hat{G}}(\eta -\xi _0)|\ge \gamma \); that is, \(\frac{d}{i2\pi du}\big [\log (A{\hat{G}}(\eta -\xi _0))+i2\pi \xi _0 u\big ]=\xi _0\). Clearly, this operator is equivalent to the reassignment rule; that is,

$$\begin{aligned} \partial _u\frac{\log \left[ V^{G}_x(u,\eta )\right] }{i2\pi }=\frac{-i\partial _u{{V}}_{x}^{G}(u,\eta )}{2\pi {{V}}_{x}^{G}(u,\eta )} \end{aligned}$$
(4)

when \(|{\hat{G}}(\eta -\xi _0)|\ge \gamma \). We choose \(\frac{-i\partial _u{{V}}_{x}^{G}(u,\eta )}{2\pi {{V}}_{x}^{G}(u,\eta )}\) to estimate the IF since we do not need to worry about the phase unwrapping problem when applying the logarithm function to a complex function. To continue, note that we have \(-i\partial _u V^G_x(u,\eta )=2\pi \xi _0V^{G}_x(u,\eta )\) by a direct calculation. Hence, \(\omega ^\gamma _x(u,\eta )=\xi _0\) when \(\eta \in [\xi _0-{\Delta },\xi _0+{\Delta }]\) and \(\omega ^\gamma _x(u,\eta )=-\infty \) otherwise. For this signal, we have \(A_{x,\gamma }(u)=[\xi _0-{\Delta },\xi _0+{\Delta }]\), and the reassignment rule indicates that the IF is \(\xi _0\). Thus, the SST of x can then be computed by the following equation:

$$\begin{aligned}&S^{G,\kappa ,\gamma }_x(u,\xi ) \nonumber \\&\quad =e^{i2\pi \xi _0 u} \int _{\xi _0-{\Delta }}^{\xi _0+{\Delta }} {\hat{G}}(\eta -\xi _0) \frac{1}{\kappa }\frac{1}{\sqrt{\pi }}e^{-|\xi -\xi _0|^2/\kappa ^2}d\eta \nonumber \\&\quad =Ce^{i2\pi \xi _0 u}\frac{1}{\kappa }e^{-|\xi -\xi _0|^2/\kappa ^2}, \end{aligned}$$
(5)

where \(C=\frac{1}{\sqrt{\pi }}\int _{-{\Delta }}^{{\Delta }} {\hat{G}}(\eta ) d\eta \approx \frac{1}{\sqrt{\pi }}G(0)\). Clearly, when \(\kappa \) is small, for each u, \(S^{G,\kappa ,\gamma }_x(u,\xi )\) is concentrated around \(\xi _0\), which helps alleviate the smearing effect in the STFT caused by the uncertainty principle.
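The harmonic example can be reproduced numerically. The following is a rough sketch under several assumptions that are not part of the source: a Gaussian window with unit integral, a central finite difference in u for \(\partial _u V\), the real part of the reassignment rule, and the kernel \(\frac{1}{\kappa }h(\cdot /\kappa )\) replaced by a discretized Dirac measure on frequency bins of width `dxi`. All function names are hypothetical.

```python
import cmath
import math


def gaussian_window(t, sigma):
    return math.exp(-t * t / (2 * sigma * sigma)) / (math.sqrt(2 * math.pi) * sigma)


def stft(x, u, eta, sigma, dt=1 / 200, half_width=5.0):
    """Riemann-sum approximation of the STFT (1) for a callable signal x."""
    n = int(half_width * sigma / dt)
    total = 0j
    for k in range(-n, n + 1):
        s = k * dt
        total += x(u + s) * gaussian_window(s, sigma) * cmath.exp(-2j * math.pi * eta * s)
    return total * dt


def reassignment_rule(x, u, eta, sigma, gamma, h=1e-3):
    """Return (V, omega) following (3); omega is None where |V| < gamma."""
    v = stft(x, u, eta, sigma)
    if abs(v) < gamma:
        return v, None  # None stands in for the value -infinity in (3)
    # Central difference in u approximates the partial derivative of V.
    dv = (stft(x, u + h, eta, sigma) - stft(x, u - h, eta, sigma)) / (2 * h)
    return v, (-1j * dv / (2 * math.pi * v)).real


def sst_column(x, u, etas, sigma, gamma, dxi):
    """One time slice of the SST (2) with a discretized Dirac kernel.

    Returns a dict mapping frequency-bin index k (frequency k * dxi) to
    the accumulated coefficient.
    """
    d_eta = etas[1] - etas[0]
    column = {}
    for eta in etas:
        v, omega = reassignment_rule(x, u, eta, sigma, gamma)
        if omega is None:
            continue
        k = round(omega / dxi)
        column[k] = column.get(k, 0j) + v * d_eta / dxi
    return column


xi0 = 5.0


def harmonic(t):
    return cmath.exp(2j * math.pi * xi0 * t)


etas = [0.25 * k for k in range(1, 41)]  # 0.25 Hz to 10 Hz
column = sst_column(harmonic, u=1.0, etas=etas, sigma=0.2, gamma=1e-3, dxi=0.1)
```

In this sketch every STFT coefficient above the threshold is moved to the single bin containing \(\xi _0\), and the total mass of that bin is close to \(G(0)\), in line with \(\int {\hat{G}}(\eta )\,d\eta \approx G(0)\).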

2.4 Second-order SST

When the IF is not slowly varying, the sharpening ability of the SST may deteriorate. The second-order SST resolves this problem by using the second-order information in the phase of the STFT to correct the reassignment rule. The second-order SST could be viewed as a combination of the SST and the RM: its sharpening ability is similar to that of the RM, and it allows us to reconstruct IMT functions like the SST. There are at least two versions of the second-order SST. We discuss the vertical SST (vSST) and the oblique SST (oSST) [41]. Both the vSST and the oSST depend on the second-order reassignment rule, which is a correction of the reassignment rule \({\omega }^\gamma _x\) in (3):

$$\begin{aligned} {\hat{\omega }}^\gamma _x (u,\eta )= \left\{ \begin{array}{ll} \omega ^\gamma _x (u,\eta ) + c(u,\eta )\left( u-{\hat{t}}_x(u,\eta )\right) &{}\quad \text{ when } {\partial _{\eta } {\hat{t}}_x}(u,\eta )\ne 0 \\ {{\omega }^\gamma _x} (u,\eta ) &{}\quad \text{ otherwise }, \end{array}\right. \nonumber \\ \end{aligned}$$
(6)

where \(u\in {\mathbb {R}}\) is the time, \(\eta >0\) is the frequency, and

$$\begin{aligned} {{{{\hat{t}}}_x}(u,\eta ) = u + i\frac{\partial _\eta {{V}}_{x}^{G}(u,\eta )}{{{V}}_{x}^{G}(u,\eta )}~ \text{ and }~ c(u,\eta )=\frac{\partial _t{{\omega }^\gamma _x} (u,\eta )}{\partial _\eta {\hat{t}}_x (u,\eta )}.} \end{aligned}$$
(7)

The vSST with the resolution \(\kappa >0\) and the threshold \(\gamma \ge 0\) is defined by

$$\begin{aligned} {{vS}}_{x}^{G,\kappa ,\gamma }(u,\xi )=\int _{A_{x,\gamma }(u)} {{V}}_{x}^{G}(u,\eta )\frac{1}{\kappa }h\Big ( \frac{|\xi -{\hat{\omega }}^\gamma _x(u,\eta )|}{\kappa } \Big ) \, \mathrm {d}\eta ; \end{aligned}$$
(8)

the oSST with the resolution \(\kappa >0\) and \(\tau >0\) and threshold \(\gamma \ge 0\) is defined by

$$\begin{aligned} {{oS}}_{x}^{G,\kappa ,\gamma }(u,\xi )&=\iint {{V}}_{x}^{G}(y,\eta )e^{i\pi (2\xi -c(y,\eta )(\tau -y))(\tau -y)}\nonumber \\&\quad \times \frac{1}{\kappa }h\Big ( \frac{|\xi -{\hat{\omega }}^\gamma _x(y,\eta )|}{\kappa } \Big )\frac{1}{\tau }h\Big ( \frac{|u-{{{\hat{t}}}_x}(y,\eta )|}{\tau } \Big ) \,\nonumber \\&\quad \mathrm {d}\eta \mathrm {d}y. \end{aligned}$$
(9)

Note that the vSST could be viewed as a direct generalization of the SST with the modified reassignment rule, while the oSST could be viewed as a mixture of the SST and the RM. The reader is referred to [41] for details of the second-order SST and [8] for its theoretical analysis.

2.5 IMT function reconstruction

Each IMT function [22] can be reconstructed from the SST, as well as the vSST, if the input signal \(x(t)=\sum _{k=1}^Kx_k(t)\) satisfies the AHM. Take the SST as an example. Each IMT function \(x_k=a_k(t)\cos (2\pi \phi _k(t)), k \in \{1,...,K\}\), can be reconstructed by the following two steps. First, evaluate the “complexification” of the k-th IMT function by

$$\begin{aligned} {\hat{x}}^{{\mathbb {C}}}_k(t) = \frac{1}{G(0)}\int _{\hat{{\mathcal {Z}}}_{k}(t)} {\tilde{S}}^{\kappa ,\gamma }_{G,x}(t,\xi )\mathrm{d}\xi , \end{aligned}$$
(10)

where \({\hat{{\mathcal {Z}}}_{k}}(t)=[{\hat{\phi }}'_k(t)-\epsilon ^{1/3},{\hat{\phi }}'_k(t)+\epsilon ^{1/3}]\) and \({\hat{\phi }}'_k(t)\) is the estimated IF of the k-th IMT function, which can be obtained by the ridge extraction algorithm [9, 12, 38]. Second, the k-th IMT function is extracted by

$$\begin{aligned} {\hat{x}}_k(t)=\mathfrak {R}{\hat{x}}^{{\mathbb {C}}}_k(t), \end{aligned}$$
(11)

where \(\mathfrak {R}\) is the operator taking the real part of the input complex value. The reconstruction formula (10) could serve as an approach to obtain the complex form of a real signal. This property is important since, in general, evaluating the complex form of an IMT function is a non-trivial issue. Note that to successfully obtain the imaginary counterpart of \(x_k(t)\), namely \(a_k(t)\sin (2\pi \phi _k(t))\), via the Hilbert transform, the spectra of \(a_k(t)\) and \(\cos (2\pi \phi _k(t))\) must satisfy several constraints. We refer the interested reader to [7, 40] for details.

3 Time-varying optimal window widths

It is well known that a short window is helpful for analyzing a signal with fast-varying IF components. On the other hand, for signals with two IMT functions with close IFs, the window should be long enough to avoid spectral overlaps. An “optimal” window should provide a balance between these two requirements. However, the uncertainty principle [26, 44] suggests that the benefits of a short and a long window width cannot be attained simultaneously. In this regard, we need a method to choose a proper window width dynamically to balance the two.

Several attempts have been proposed in the literature to balance between different window bandwidths. For example, in [32, 43], the TFJP was proposed to select the optimal window for the GT based on the Rényi entropy [20]; in [4], the NSGT depends on a frame associated with a non-uniform grid on the TF plane, which comes from the information provided by the signal. The frame could be viewed as the “optimal window” for the GT. These approaches have been shown to be helpful in the audio processing [32], for example, the beat tracking problem [28]. In general, these approaches could be understood as the TF tiling or a dictionary learning problem—for a chosen redundancy, how to provide the best tiling of the TF plane, or to choose the optimal frame, so that the TF representation is “optimal” based on a chosen criterion, for example, the minimal \(\ell ^1\) norm [24] or the minimal Rényi entropy.

The reassignment-type transforms could be viewed as an approach to solve the dictionary learning problem by taking the phase of the STFT into account. Note that the STFT could be viewed as evaluating the coefficients of a signal associated with an infinitely redundant dictionary

$$\begin{aligned} {\mathcal {D}}=\{G(t-\cdot )e^{i2\pi \xi t}\}_{t\in {\mathbb {R}},\xi \in {\mathbb {R}}^+}, \end{aligned}$$
(12)

where G is the chosen window. Directly determining the optimal frame out of \({\mathcal {D}}\) is not an easy task. Instead of determining the optimal frame, the reassignment rule used in the RM and the SST and its variations could be viewed as an alternative to approximate the optimal frame out of \({\mathcal {D}}\). Note that in the SST (2), the vSST (8), and the oSST (9), the coefficients of the STFT are moved to a new location based on the reassignment rule. In this sense, nonlinear-type TF analysis could be viewed as evaluating the coefficients of an approximated optimal frame. We mention that this viewpoint has been taken into account to design the Tycoon algorithm [33]. Theoretically, if the signal satisfies the AHM model, it has been shown that the reassignment rule could lead to the optimal frame [12, 22, 41]. However, due to the lack of knowledge of the model parameters, like \(\epsilon ,c,d\) of a given signal, the reassignment rule, and hence the TFR, might be influenced by the interaction of the chosen window and the time-varying AM and IF, and the overlap of spectra of different oscillatory components. In practice, although we have a rule of thumb of how to choose the window based on the a priori knowledge of the signal, the reassignment rule might deviate from the optimal frame.

In order to resolve this issue, we propose an adaptive way to determine the optimal window for the reassignment-type transforms. This approach can be viewed as correcting the approximated optimal frame determined by the reassignment rule. A window is regarded as optimal for a chosen reassignment-type TF analysis if it provides the most concentrated TFR. Since the IF and AM of each IMT function may vary from time to time, a single window optimal for the entire signal might not be suitable. Therefore, the notion of the optimal window for a chosen TF analysis should be local. For example, for each time, we determine an optimal window.

In general, finding the optimal window is a difficult task. In statistics, the problem is commonly reduced to the window bandwidth selection problem [52]. In this work, we simplify the window selection problem to the window bandwidth selection problem. To further simplify the discussion, we consider the Gaussian window, that is,

$$\begin{aligned} G(t)={g_\sigma }(t) := \frac{1}{\sqrt{2\pi }\sigma } e^{-t^2/(2\sigma ^2)}, \end{aligned}$$
(13)

where \(\sigma >0\) is the bandwidth of the window. In this case, the STFT is the same as the GT. In this section, for a chosen TF analysis with the Gaussian window (13), we describe a time-varying optimal window width (TVOWW) selection scheme and an adaptive optimal window width (AOWW) selection scheme to compute a series of local optimal window widths. We mention that although we focus on the window bandwidth selection problem with the Gaussian window, the discussion below could be directly generalized to other window functions or even multiple window functions.

3.1 The TVOWW and the AOWW selection schemes

First select a reassignment-type transform, for example, the SST. The TVOWW selection scheme evaluates the local window width by iterating the following steps for each time \(u\in {\mathbb {R}}\):

  1. Evaluate the distribution concentration of the TFR on \([u-b,u+b]\times {\mathbb {R}}^+\), where \(b\ge 0\) determines the size of the neighborhood, by a chosen distribution concentration measure, denoted as \(\mathrm C_{\sigma ,b}(u)\).

  2. The local optimal window width at the time instant u is determined by

    $$\begin{aligned} {\tilde{\sigma }}_b(u):=\text {argmin}_{\sigma >0}\mathrm C_{\sigma ,b}(u). \end{aligned}$$
    (14)

  3. Apply the window width \({\tilde{\sigma }}_b(u)\) to evaluate the TFR of the signal x(t) at time u.

The proposed scheme could be directly applied to other TF analyses, such as the STFT, the second-order SST, or other nonlinear TF analyses. When \(b=\infty \), \({\tilde{\sigma }}_b\) is constant in time and the scheme reduces to the original SST with a single window width, which is chosen to optimize the selected measure of distribution concentration. We refer to this special case as the global optimal window width (GOWW).
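The three steps above amount to an argmin over window widths at each time. A minimal sketch follows, assuming the minimization in (14) is restricted to a finite candidate set and that the concentration measure is supplied as a callable; the name `select_window_widths` and the toy measure `mock_concentration` are hypothetical stand-ins, not the measure used in the paper.

```python
def select_window_widths(times, candidates, concentration):
    """For each time u, return the candidate sigma minimizing concentration(sigma, u).

    This is (14) with the argmin taken over a finite candidate set.
    """
    return [
        min(candidates, key=lambda sigma: concentration(sigma, u))
        for u in times
    ]


def mock_concentration(sigma, u):
    """Toy measure whose minimizer drifts linearly in time (illustration only)."""
    target = 0.1 + 0.05 * u
    return (sigma - target) ** 2


times = [0.0, 2.0, 4.0]
candidates = [0.1, 0.2, 0.3]  # window widths in seconds
widths = select_window_widths(times, candidates, mock_concentration)
```

With \(b=\infty \), the measure no longer depends on u, so the same loop returns one width for all times, which corresponds to the GOWW case.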

The AOWW selection scheme evaluates the local window width via iterating the following steps for a given pair of time and frequency, \((u,\xi )\).

  1. Evaluate the distribution concentration of the TFR on \([u-b,u+b]\times [\max \{0,\xi -b_F\},\xi +b_F]\), where \(b\ge 0\) determines the size of the neighborhood in the time axis and \(b_F>0\) determines the size of the neighborhood in the frequency axis, by a chosen distribution concentration measure, denoted as \(\mathrm C_{\sigma ,b,b_F}(u,\xi )\).

  2. The local optimal window width at the TF point \((u,\xi )\) is determined by

    $$\begin{aligned} {\tilde{\sigma }}_{b,b_F}(u,\xi ):=\text {argmin}_{\sigma >0}\mathrm C_{\sigma ,b,b_F}(u,\xi ). \end{aligned}$$
    (15)

  3. Apply the window width \({\tilde{\sigma }}_{b,b_F}(u,\xi )\) to evaluate the TFR of the signal x(t) at time u and frequency \(\xi \).

While the AOWW could provide a sharper TFR than the TVOWW, the computational burden of the AOWW selection scheme is much greater. Furthermore, for a given time u, since the window width varies across frequencies, the reconstruction formula (10) cannot be applied. We mention that the above algorithm can be easily generalized to select among multiple window functions: different windows can be taken into account in the optimization (14) or (15), so that the optimal window function and its corresponding optimal window width are selected. Since the multiple window selection is beyond the scope of this work, we leave it to future work.

3.2 Rényi entropy

The information entropy is a common measure of the dispersion of information content. By viewing the TFR at each time as a probability density function, a larger entropy indicates a less concentrated TFR. In this study, we adopt the Rényi entropy to measure the distribution concentration of a TFR [6].

The \(\alpha \)-Rényi entropy of a nonzero function p, where \(\alpha >0\), is defined as

$$\begin{aligned} {R}_{\alpha }(p):=\frac{1}{1-\alpha } \log _2 \left( \frac{\Vert p\Vert _{2\alpha }}{\Vert p\Vert _2}\right) ^{2\alpha }, \end{aligned}$$
(16)

where \(\Vert p\Vert _\alpha :=(\int |p(x)|^\alpha d x)^{1/\alpha }\) for \(0<\alpha <\infty \). Note that when \(\alpha <1\), \(\Vert \cdot \Vert _\alpha \) is not a norm but a quasi-norm. It is well known that the larger the Rényi entropy is, the less concentrated the distribution is [32, 48]. That is to say, a window width providing the smallest Rényi entropy is regarded as the optimal window width. Note that when \(\alpha \rightarrow 0\), the Rényi entropy gives the \(\ell ^0\) norm information of the signal; when \(\alpha \rightarrow 1\), the Shannon entropy is recovered; and when \(\alpha \rightarrow 1/2\), we obtain the information of the commonly used ratio norm \(\ell ^1/\ell ^2\). In general, \(\alpha >2\) is recommended for TFR measures [48] and we chose \(\alpha =2.4\) in this study. In practice, we notice that the results are insensitive to \(\alpha \) within a certain range of values (\(\alpha >0\)).
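For sampled data, definition (16) can be evaluated by replacing the integrals with sums. The sketch below is a direct discretization under the assumption of a uniform grid with spacing `dx`; the function name `renyi_entropy` is illustrative. For a function that is constant on N grid points and zero elsewhere (with `dx = 1`), the formula gives \(\log _2 N\) for any admissible \(\alpha \), so a more spread-out profile has a larger entropy.

```python
import math


def renyi_entropy(p, alpha, dx=1.0):
    """Discretized alpha-Renyi entropy (16) of samples p on a uniform grid."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    # ||p||_{2 alpha}^{2 alpha}
    numerator = sum(abs(v) ** (2 * alpha) for v in p) * dx
    # ||p||_2^{2 alpha}
    denominator = (sum(abs(v) ** 2 for v in p) * dx) ** alpha
    return math.log2(numerator / denominator) / (1 - alpha)


flat = [1.0] * 8           # energy spread evenly over 8 bins
spike = [0.0] * 7 + [1.0]  # energy concentrated in a single bin
h_flat = renyi_entropy(flat, alpha=2.4)
h_spike = renyi_entropy(spike, alpha=2.4)
```

The value is invariant to a rescaling of p, since the numerator and the denominator both scale with the same power of the amplitude.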

Denote by R the TFR of a chosen TF analysis, defined on \({\mathbb {R}}\times {\mathbb {R}}^+\) and computed with the window width \(\sigma \). The TFR distribution is considered the most concentrated if its corresponding Rényi entropy is minimized. We thus define the measure of distribution concentration in the TVOWW selection scheme as

$$\begin{aligned} C_{\sigma ,b}(u):=\frac{1}{1-\alpha } \log _2\frac{\iint _{I_{u}} |R(t,\xi )|^{2\alpha }\mathrm {d}t\mathrm {d}\xi }{\big (\iint _{I_{u}} |R(t,\xi )|^2 \mathrm {d}t\mathrm {d}\xi \big )^\alpha }, \end{aligned}$$
(17)

where \(u\in {\mathbb {R}}\) and \(I_u:=[u-b,u+b]\times [0,\infty )\). Similarly, the distribution concentration measure in the AOWW selection scheme is defined as

$$\begin{aligned} C_{\sigma ,b,b_F}(u,\xi ) :=\frac{1}{1-\alpha } \log _2\frac{\iint _{J_{u,\xi }} |R(t,\eta )|^{2\alpha }\mathrm {d}t\mathrm {d}\eta }{\big (\iint _{J_{u,\xi }} |R(t,\eta )|^2 \mathrm {d}t\mathrm {d}\eta \big )^\alpha }, \end{aligned}$$
(18)

where \(u\in {\mathbb {R}}, \xi \in {\mathbb {R}}^+\), and \(J_{u,\xi }:=[u-b,u+b]\times [\max \{0,\xi -b_F\},\xi +b_F]\).
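On a discrete TF grid, the measure (17) reduces to sums over the time samples that fall in \([u-b,u+b]\). The following is a minimal sketch, assuming the TFR is stored as a list of rows (one row of frequency bins per time sample); the function name and the toy TFRs are hypothetical. The grid spacings are omitted because they only shift the measure by a constant that does not depend on \(\sigma \), so the argmin over \(\sigma \) is unchanged.

```python
import math


def local_concentration(tfr, times, u, b, alpha=2.4):
    """Discrete version of (17).

    tfr[i][j] is the TFR value at time times[i] and frequency bin j.
    Only rows with u - b <= times[i] <= u + b enter the sums.
    """
    rows = [row for t, row in zip(times, tfr) if u - b <= t <= u + b]
    numerator = sum(abs(v) ** (2 * alpha) for row in rows for v in row)
    denominator = sum(abs(v) ** 2 for row in rows for v in row) ** alpha
    return math.log2(numerator / denominator) / (1 - alpha)


times = [0, 1, 2, 3, 4]
# Toy TFRs: one sharp ridge versus the same energy smeared over four bins.
sharp = [[0, 0, 1, 0, 0, 0, 0, 0] for _ in times]
blurred = [[0.5, 0.5, 0.5, 0.5, 0, 0, 0, 0] for _ in times]
c_sharp = local_concentration(sharp, times, u=2, b=1)
c_blurred = local_concentration(blurred, times, u=2, b=1)
```

The measure (18) can be sketched in the same way by additionally restricting each row to the frequency bins inside \([\max \{0,\xi -b_F\},\xi +b_F]\).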

4 Results and discussions

We start the demonstration of the proposed TVOWW and AOWW selection schemes by analyzing a synthetic signal. We then show the result of analyzing the laser-driven atomic dipole moment and discuss the performance of the proposed scheme. In this section, for the SST and the second-order SST, the numerical values of \(\kappa \) and \(\tau \) are selected to be small enough so that \(\frac{1}{\kappa }h(\frac{\cdot }{\kappa })\) and \(\frac{1}{\tau }h(\frac{\cdot }{\tau })\) are both implemented as discretized Dirac measures. The \(\gamma \) value is fixed at \(10^{-6}\%\) of the mean square energy of the signal x(t) under analysis.

4.1 Synthetic signal

Consider a multicomponent signal given by

$$\begin{aligned}&x(t) = x_1(t)+x_2(t)+x_3(t), \end{aligned}$$
(19)

where the signal components are:

$$\begin{aligned} x_1(t)&= \cos (2\pi \phi _1(t))\chi _{[-\infty , 20]}(t) \\ x_2(t)&= \cos (2\pi \phi _2(t))\chi _{[-\infty , 13.6]}(t) \\ x_3(t)&= \cos (2\pi \phi _3(t))\chi _{[17.5,\infty ]}(t), \end{aligned}$$

where \(\chi _I\) is the indicator function supported on \(I\subset {\mathbb {R}}\) and

$$\begin{aligned} \phi _1(t)&=1.33^{t-5}+3t \\ \phi _2(t)&=-0.0437(t-5)^4+0.5(t-5)^3+0.25(t-5)^2+5t \\ \phi _3(t)&=-\frac{2.7}{3.5}\cos {(3.5t)+0.85(t-15)^2+0.5t}. \end{aligned}$$

The corresponding IFs are \(\phi '_1(t)=({\ln 1.33})1.33^{t-5}+3\), \(\phi '_2(t)=-0.175(t-5)^3+1.5(t-5)^2+0.5(t-5)+5\), and \(\phi '_3(t)=2.7{\sin 3.5t}+1.7(t-15)+0.5\). The observed signal is \(Y(t)=x(t)+\lambda {\varPhi }(t)\), where \({\varPhi }\) is white Gaussian noise with mean 0 and standard deviation (std) 1, and the value \(\lambda >0\) is chosen so that the signal-to-noise ratio (SNR), defined as \(20\log _{10} \frac{\text {std}(x(t))}{\lambda }\), is 15 dB. Y(t) is sampled at 60 Hz from the 0-th to the 25-th second (s). We select the optimal window width \(\sigma \) from a set of candidate bandwidths, \(\{11/720,31/720,\ldots ,501/720\}\) s.
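The synthetic signal can be generated directly from these formulas. The sketch below makes a few assumptions not stated in the text: the sampling grid includes both endpoints (0 s and 25 s), the std is the population std of the sampled clean signal, and the random seed is arbitrary.

```python
import math
import random
import statistics


def phi1(t):
    return 1.33 ** (t - 5) + 3 * t


def phi2(t):
    return -0.0437 * (t - 5) ** 4 + 0.5 * (t - 5) ** 3 + 0.25 * (t - 5) ** 2 + 5 * t


def phi3(t):
    return -(2.7 / 3.5) * math.cos(3.5 * t) + 0.85 * (t - 15) ** 2 + 0.5 * t


def clean_signal(t):
    """x(t) = x_1(t) + x_2(t) + x_3(t) with the indicator supports of (19)."""
    x = 0.0
    if t <= 20:
        x += math.cos(2 * math.pi * phi1(t))
    if t <= 13.6:
        x += math.cos(2 * math.pi * phi2(t))
    if t >= 17.5:
        x += math.cos(2 * math.pi * phi3(t))
    return x


fs = 60                                           # sampling rate in Hz
times = [n / fs for n in range(25 * fs + 1)]      # 0 s to 25 s, endpoints included
x = [clean_signal(t) for t in times]

snr_db = 15
lam = statistics.pstdev(x) / 10 ** (snr_db / 20)  # solves 20 log10(std(x) / lam) = 15

rng = random.Random(0)                            # fixed seed, arbitrary choice
y = [value + lam * rng.gauss(0, 1) for value in x]
```

A finite-difference derivative of each phase function is a quick way to confirm the stated IFs, for example \(\phi '_1(5)=\ln 1.33+3\).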

Fig. 1 (a) The TFR of the SST with the GOWW. (b) The TFR of the SST with the TVOWW. (c) The true IFs (blue \(x_1(t)\); magenta \(x_2(t)\); red \(x_3(t)\)) superimposed on the TFR of the SST with the TVOWW. Note that the range of the color bar is increased for the comparison. (d) The spectral gap (upper panel) and the corresponding TVOWW (lower panel). It is clear that when the spectral gap is small, a longer window is needed. The TFR values are normalized by the z-score (color figure online)

4.1.1 TFR with the GOWW

We first show the limitation of using the GOWW selection scheme for the SST. In other words, we run the optimal window selection scheme with \(b=\infty \), resulting in the GOWW of 71/720 s. Fig. 1a demonstrates that the SST with the GOWW can capture the oscillatory dynamics. Nevertheless, while a small window width is required to reduce the Rényi entropy of the TFR, it results in an evident interference pattern between the neighboring IF components. For example, the spectral gap (the difference between adjacent IF components) at the 5-th s (i.e., between \(\phi _1'(5)\) and \(\phi _2'(5)\)) is 1.8 Hz, and a strong interference pattern is observed at the 5-th s. According to Definitions 3.1 and 3.2 in [22], the window width, measured by the full width at half maximum (FWHM), which is defined as \(2\sqrt{2\ln 2}{{{\tilde{\sigma }}}_b}\), should be at least \(1/1.8\approx 0.55\) s in order to separate the two neighboring components in the AHM. Here, the FWHM of the GOWW is 0.23 s, which is insufficient and leads to the interference pattern. Similar interference patterns can be observed at 13.6 s and 18.5 s, where the spectral gaps are approximately 3 Hz and 7 Hz, respectively. It is clear that a larger spectral gap results in a less coupled interaction between the IF components. In summary, since the optimal window is chosen globally, the local details may not be refined even if the overall sharpness of the TFR is increased.
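The FWHM comparison above is simple arithmetic and can be checked as follows; this is only a sketch of the calculation stated in the text, with an illustrative function name.

```python
import math


def fwhm(sigma):
    """Full width at half maximum of a Gaussian window with bandwidth sigma."""
    return 2 * math.sqrt(2 * math.log(2)) * sigma


goww = 71 / 720             # global optimal window width, in seconds
required = 1 / 1.8          # minimal FWHM for a 1.8 Hz spectral gap, in seconds
goww_fwhm = fwhm(goww)      # about 0.23 s
too_short = goww_fwhm < required
```

Since the FWHM of the GOWW is well below the required value, the two components near the 5-th s cannot be separated with this window.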

4.1.2 TFR with the TVOWW

We next demonstrate that the proposed TVOWW selection scheme can further improve the TFR quality. To reduce the computational cost, we evaluate the local optimal window width every 0.25 s in a neighborhood with a width of \(2b=0.33\) s. The final result is found to be insensitive to the neighborhood size. While a small value is favorable, the width of the neighborhood should be greater than the sampling period [27]. Subsequently, linear interpolation is applied to the samples of the TVOWWs so that there is an optimal window width for each time instant in the signal interval. The TFR of the SST with the TVOWW is presented in Fig. 1b and its comparison with the true IFs is displayed in Fig. 1c. The coupling artifact between close IF components is eliminated, particularly at the 5-th s, as well as at 13.6 s and 18.5 s. The IF components in the improved TFR approach the true IF components, as shown in Fig. 1c. We further display the corresponding TVOWW along with the spectral gap in Fig. 1d. According to this figure, the window widths become large around 5, 13.6, and 18.5 s, where the IFs are close, in order to separate the different IF components. Note that at time 5 s, the largest window width is \({{{\tilde{\sigma }}}_b}=\frac{345}{720}\approx 0.48\) s, corresponding to a FWHM of approximately 1.13 s, which is larger than 0.55 s.
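The interpolation step can be sketched as a piecewise-linear lookup. The example below assumes sorted sample times and clamps queries outside the sampled range to the end values, which is an assumption of this sketch rather than a detail given in the text; the sample widths are hypothetical.

```python
from bisect import bisect_right


def interpolate_width(sample_times, sample_widths, t):
    """Piecewise-linear interpolation of window widths sampled at sample_times.

    sample_times must be sorted. Queries outside the sampled range are
    clamped to the first or last width (an assumption of this sketch).
    """
    if t <= sample_times[0]:
        return sample_widths[0]
    if t >= sample_times[-1]:
        return sample_widths[-1]
    i = bisect_right(sample_times, t) - 1
    t0, t1 = sample_times[i], sample_times[i + 1]
    w = (t - t0) / (t1 - t0)
    return (1 - w) * sample_widths[i] + w * sample_widths[i + 1]


# Hypothetical widths evaluated every 0.25 s, queried on a 60 Hz grid.
sample_times = [0.0, 0.25, 0.5]
sample_widths = [0.1, 0.3, 0.2]
dense = [interpolate_width(sample_times, sample_widths, n / 60) for n in range(31)]
```

Evaluating the selection scheme on a coarse grid and interpolating keeps the number of entropy evaluations proportional to the number of coarse samples rather than to the signal length.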

Fig. 2

A large window is needed to separate the two adjacent IF components for different TF analyses. The window width in the upper panel is \(\frac{111}{720}\) s (small) and that in the lower panel is \(\frac{345}{720}\) s (large). The TFRs of the SST, the vSST, the oSST, and the RM are shown in (a, e), (b, f), (c, g), and (d, h), respectively. The TFR values are normalized by the z-score. While the TFRs of the second-order SST and the RM are sharpened, with the small window width these TFRs suffer from the “coupling artifact” caused by the two close IF components. A longer window in this case helps remove the artifact. The SST cannot handle the fast-varying IF well.

4.1.3 Necessity of selecting a proper window width

In this subsection, we emphasize that while the second-order SST and the RM could provide a sharper TFR compared with the SST, the impact of the window width is not negligible. We show the TFRs of the synthetic signal (19) analyzed by the second-order SST and the RM in Figs. 2 and 3. In both figures, no noise is involved. While the second-order SST and the RM can mitigate the limitation of the SST caused by the fast-varying IF components, they could fail without a proper choice of the window width.

Figure 2 shows that a large window width is required to separate the two components with close IFs. Panels (a–d) of Fig. 2 show the TFRs obtained with a small window width of \(\frac{111}{720}\) s, which is the GOWW of the vSST. The coupling artifact between the two components caused by the small width is evident for all the transforms. As mentioned in previous sections, the coupling artifact can be greatly diminished by increasing the window width. In panels (e–h) of Fig. 2, we choose the window width as \(\frac{345}{720}\) s, which is the largest TVOWW for the SST. For all TFRs, the two IF components are clearly separated, particularly in the RM result in Fig. 2h.

Figure 3 shows that a small window width is required to capture the variation in an oscillating IF component. Here the small window width is \(\frac{111}{720}\) s and the large window width is \(\frac{251}{720}\) s. A small window width provides a fine temporal resolution, which allows us to extract the dynamical information of an IF component (a–d in Fig. 3), while a large window width causes ambiguity in the temporal direction (e–h in Fig. 3).

In summary, a proper window width is a prerequisite for obtaining accurate IF information from the TFR, even though the conventional reassignment rule in the RM and the high-order reassignment rules in the second-order SST can cope with the fast-varying IF components efficiently.

Fig. 3

A small window width is needed to capture the variation in the oscillatory IF component. The window width in the upper panel is \(\frac{111}{720}\) s (small) and that in the lower panel is \(\frac{251}{720}\) s (large). The TFRs of the SST, the vSST, the oSST, and the RM are shown in (a, e), (b, f), (c, g), and (d, h), respectively. The TFR values are normalized by the z-score. While the TFRs of the second-order SST and the RM are sharpened, with the large window width these TFRs are “confused” by the fast-varying IFs. A shorter window in this case helps increase the TFR quality. The SST cannot handle the fast-varying IF well.

4.1.4 Reconstruction error analysis

Finally, to quantify the improvement of the TFR by taking the TVOWW into account, we evaluate the normalized root-mean-square deviation (RMSD) by comparing the reconstructed signal components, the IF, and the AM with corresponding true answers.

The normalized RMSD for the evaluation component \({\hat{f}}_i\), where \(i=1, 2\), and 3, is given as

$$\begin{aligned} {\hbox {Normalized RMSD}}\left( {\hat{f}}_i\right) =\frac{{\Vert {\hat{f}}_i(t) - f_i(t)\Vert _{L^2}}}{|f_{i,\mathrm{max}}-f_{i,\mathrm{min}}|}, \end{aligned}$$
(20)

where \(f_{i,\mathrm{max}}\) and \(f_{i,\mathrm{min}}\) are the maximum and minimum values of \(f_i\), respectively. Here \({\hat{f}}_i\) can represent the reconstructed signal component \({\hat{x}}_i\), the reconstructed IF, or the reconstructed AM from the TFR.
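For sampled data, the error measure can be sketched as below. This sketch assumes that the numerator is the root-mean-square of the pointwise difference between the estimate and the ground truth, which is the usual meaning of RMSD, and that the normalization is the range of the ground truth. The function name is hypothetical.

```python
import numpy as np

def normalized_rmsd(estimate, truth):
    """Root-mean-square deviation divided by the range of the ground truth."""
    estimate = np.asarray(estimate, dtype=float)
    truth = np.asarray(truth, dtype=float)
    rmsd = np.sqrt(np.mean((estimate - truth) ** 2))
    return rmsd / (truth.max() - truth.min())

# Toy check: a constant offset of 0.4 on a signal with range 4 gives 0.4 / 4.
truth = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
estimate = truth + 0.4
err = normalized_rmsd(estimate, truth)
```

Dividing by the range makes the errors of the signal components, the IFs, and the AMs comparable even though they live on different scales.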

In addition to the noiseless condition, we compute the normalized RMSD at SNRs of 15 and 10 dB over 25 trials and report the mean and the standard deviation of the normalized RMSD.
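The noisy trials can be organized as in the following sketch. It assumes additive Gaussian white noise and the SNR convention \(10\log _{10}\) (signal power / noise power); the text above does not restate the convention, so this is an assumption. The callable `error_fn` is a hypothetical placeholder for the whole chain "compute the TFR, reconstruct, evaluate the normalized RMSD"; the toy placeholder below skips the TFR step entirely.

```python
import numpy as np

def add_white_noise(x, snr_db, rng):
    """Add Gaussian white noise so that 10*log10(signal power / noise power) = snr_db."""
    noise_power = np.mean(x ** 2) / 10 ** (snr_db / 10)
    return x + rng.normal(0.0, np.sqrt(noise_power), size=x.shape)

def monte_carlo_error(x, snr_db, error_fn, n_trials=25, seed=0):
    """Mean and standard deviation of error_fn over independent noise realizations."""
    rng = np.random.default_rng(seed)
    errors = [error_fn(add_white_noise(x, snr_db, rng)) for _ in range(n_trials)]
    return float(np.mean(errors)), float(np.std(errors))

t = np.linspace(0, 20, 20000, endpoint=False)
clean = np.cos(2 * np.pi * 3 * t)

def placeholder_error(noisy):
    """Toy stand-in: error of the raw noisy signal, with no TFR-based reconstruction."""
    return np.sqrt(np.mean((noisy - clean) ** 2)) / (clean.max() - clean.min())

mean_err, std_err = monte_carlo_error(clean, 15, placeholder_error)
```

Fixing the seed makes the 25 realizations reproducible, so that different window selection schemes can be compared on identical noise.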

The results for the reconstructed signal components, the reconstructed IF, and the reconstructed AM for each component are presented in Figs. 4, 5, and 6, respectively. The IF components are estimated by evaluating the center of mass of the TFR. The AM components are extracted from the envelope of the reconstructed signal components. Note that the error depends on the method used to compute the AM components.
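The two estimators mentioned above can be sketched as follows. This is one possible reading, not necessarily the authors' exact procedure: the center of mass is taken over a frequency band around one component using \(|\mathrm{TFR}|\) as the weight, and the envelope is the modulus of the analytic signal obtained with an FFT-based Hilbert transform. The band limits, the weighting, and the function names are assumptions.

```python
import numpy as np

def if_center_of_mass(tfr, freqs, f_lo, f_hi):
    """IF estimate: |TFR|-weighted mean frequency inside [f_lo, f_hi] at each time."""
    band = (freqs >= f_lo) & (freqs <= f_hi)
    weights = np.abs(tfr[band, :])
    return (freqs[band, None] * weights).sum(axis=0) / weights.sum(axis=0)

def envelope(x):
    """Amplitude envelope as the modulus of the analytic signal (FFT-based)."""
    n = len(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * h))

# Toy data: a Gaussian ridge around a known IF, and an AM-modulated cosine.
t = np.arange(1000) / 100.0
freqs = np.linspace(0, 20, 201)
true_if = 10 + 2 * np.sin(2 * np.pi * 0.2 * t)
toy_tfr = np.exp(-0.5 * ((freqs[:, None] - true_if[None, :]) / 0.3) ** 2)
if_est = if_center_of_mass(toy_tfr, freqs, 5.0, 15.0)

true_am = 1 + 0.3 * np.cos(2 * np.pi * 0.2 * t)
am_est = envelope(true_am * np.cos(2 * np.pi * 10 * t))
```

Other AM estimators, for example reading the amplitude along the ridge of the TFR, would give different error values.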

Fig. 4

Normalized RMSD estimated by comparing the true answer \(x_i(t)\) with the reconstructed signal \({{{\hat{x}}}}_i(t)\) from TFRs with the GOWW or the TVOWW for \(i=1,2,\) and 3. Noisy cases are considered for the SNR of 15 and 10 dB

Fig. 5

Normalized RMSD estimated by comparing the true instantaneous frequency with the reconstructed instantaneous frequency from TFRs with the GOWW or the TVOWW for each IMT. Noisy cases are considered for the SNR of 15 and 10 dB

Fig. 6

Normalized RMSD estimated by comparing the true amplitude modulation with the reconstructed amplitude modulation from TFRs with the GOWW or the TVOWW for each IMT. Noisy cases are considered for the SNR of 15 and 10 dB

The results confirm the benefit of the TVOWW selection scheme, particularly for the components \(x_1(t)\) and \(x_2(t)\). For \(x_3(t)\), the errors of the GOWW and the TVOWW selection schemes are similar, since this signal component is less coupled with the others.

Although not shown in the paper due to the page limit, we mention that the TVOWW selection technique can be applied to the second-order SST and other variations of the SST to improve the reconstruction quality.

4.1.5 Toward an optimally concentrated TFR—AOWW

We demonstrate that the AOWW selection scheme can achieve a more concentrated TFR by considering the optimal window width along both the time and frequency axes. For the synthetic signal, we set \(2b_F=1.6\) Hz and evaluate the local optimal window width every 1.4 Hz. The TFR results for the SST and the vSST using the AOWW selection scheme are presented in Fig. 7. Comparing Fig. 7a with Fig. 1b, we find that the TFR is sharpened by the AOWW selection scheme, at the expense of significantly increased computation and the loss of the inverse routine to reconstruct each IMT component. Moreover, the TFR of the vSST with the AOWW (Fig. 7b) and that with the TVOWW are similar.
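The idea of selecting the window width jointly in time and frequency can be sketched as follows. The sketch assumes that the TFRs for all candidate window widths have already been computed and stacked, that the Rényi entropy is taken in its common normalized form with \(\alpha =2.4\), and that each TF point inherits the choice of its nearest grid center. The last point is a simplification made here; the function names and the toy data are hypothetical.

```python
import numpy as np

def renyi_entropy(P, alpha=2.4):
    """Renyi entropy of a nonnegative TF block with nonzero total energy."""
    p = P / P.sum()
    return np.log2(np.sum(p ** alpha)) / (1.0 - alpha)

def aoww_select(stack, t, f, b, b_f, t_step, f_step, alpha=2.4):
    """stack[k] is the nonnegative TFR (freq x time) for the k-th candidate width."""
    t_c = np.arange(t[0], t[-1] + 1e-12, t_step)      # time centers
    f_c = np.arange(f[0], f[-1] + 1e-12, f_step)      # frequency centers
    choice = np.zeros((len(f_c), len(t_c)), dtype=int)
    for i, fc in enumerate(f_c):
        f_mask = np.abs(f - fc) <= b_f
        for j, tc in enumerate(t_c):
            t_mask = np.abs(t - tc) <= b
            ent = [renyi_entropy(S[np.ix_(f_mask, t_mask)], alpha) for S in stack]
            choice[i, j] = int(np.argmin(ent))
    # Assign every TF point the choice made at its nearest center.
    fi = np.abs(f[:, None] - f_c[None, :]).argmin(axis=1)
    ti = np.abs(t[:, None] - t_c[None, :]).argmin(axis=1)
    idx = choice[np.ix_(fi, ti)]
    adaptive = np.take_along_axis(stack, idx[None, :, :], axis=0)[0]
    return idx, adaptive

# Toy stack: candidate 0 is sharp near 2 Hz, candidate 1 is sharp near 8 Hz.
f = np.arange(0, 10, 0.25)
t = np.arange(0, 5, 0.1)
stack = np.full((2, f.size, t.size), 0.1)
stack[0, f < 5, :] = 1e-3
stack[0, 8, :] = 1.0      # f[8] = 2.0 Hz
stack[1, f >= 5, :] = 1e-3
stack[1, 32, :] = 1.0     # f[32] = 8.0 Hz
idx, adaptive = aoww_select(stack, t, f, b=0.5, b_f=1.0, t_step=0.5, f_step=1.0)
```

Because the assembled TFR mixes values obtained with different windows at different TF points, no single inverse formula applies to it, which is consistent with the loss of the reconstruction routine noted above.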

Fig. 7

TFRs of the SST and the vSST using the AOWW

4.1.6 The influence of parameters in the GOWW, the TVOWW and the AOWW selection schemes

We mention that the optimal \(\alpha \) value chosen for the Rényi entropy might depend on the application. For a specific application, we could further optimize \(\alpha \), and it might depend on parameters such as the sampling rate, frequency-axis and time-axis discretization, and the parameters b and \(b_F\).

In this subsection, we show that the GOWW, the TVOWW and the AOWW selection schemes are not sensitive to these parameters. Table 1 presents the normalized RMSD of the reconstructed components for three different values of \(\alpha \), namely 2, 2.4, and 2.8, with b and \(b_F\) fixed. Table 2 presents the normalized RMSD of the reconstructed components for three different values of b, namely 0.08 s, 0.17 s, and 0.25 s, with \(\alpha \) and \(b_F\) fixed. Since no reconstruction routine is available for the AOWW, the dependence of the AOWW on the chosen parameters is evaluated through the deviation of the IF components obtained via ridge extraction. Table 3 presents the normalized RMSD of the reconstructed IF components for three different values of \(b_F\), namely 0.6 Hz, 0.8 Hz, and 1 Hz, with \(\alpha \) and b fixed. Here the local optimal window width is evaluated every 1 Hz. These results provide evidence that the GOWW, the TVOWW, and the AOWW selection schemes are stable with respect to the three major parameters \(\alpha , b\), and \(b_F\).

Table 1 Component reconstruction errors for three different values of \(\alpha \) in the GOWW and the TVOWW selection schemes
Table 2 Component reconstruction errors for three different values of b in the TVOWW selection scheme
Table 3 Instantaneous frequency reconstruction error with three different values of \(b_F\) in the AOWW selection scheme

4.2 Application to attosecond physics

During the past decade, real-time observation and direct control of electronic motion in atoms, molecules, nanostructures and solids have been achieved owing to advances in the synthesis of attosecond pulses [34]. In general, an isolated attosecond pulse is created by the superposition of a broadband supercontinuum in high-order harmonic generation driven by high-intensity femtosecond laser pulses [13]. To date, an isolated attosecond pulse as short as 67 attoseconds has been reported [55]. To synthesize shorter attosecond pulses, a better understanding of the underlying physical mechanism is needed. This mechanism can be understood by applying TF analysis to the electron dipole moment oscillation induced by an applied laser field. In the previous literature [1, 2, 14, 35, 39, 42, 47, 50, 51], linear-type transforms based on short window widths have been adopted and the results are consistent with the classical trajectory simulations [21]. However, there has been no discussion of how and why small windows are chosen in the field of attosecond physics.

Fig. 8

TFR of the SST with the TVOWW for the electron dipole moment in an acceleration form. Note that the TFR is in the logarithmic scale. The second emission that reaches up to the 900th harmonic order in a very short interval can be utilized to synthesize an isolated ultrashort attosecond pulse

Fig. 9

Enlarged figures from Fig. 8 show delicate differences between the TFR with the GOWW (a) and the TVOWW (b). The blue arrow indicates the branch corresponding to the long trajectory quantum path, and the red arrow indicates the branch corresponding to the short trajectory quantum path (color figure online)

Fig. 10

(a) TFR of the synchrosqueezed Morlet wavelet transform of the acceleration dipole moment using a laser field of a wavelength of 800 nm and an intensity of \(5\times 10^{13}\;\text{ W/cm }^2\). (b) TFR of the synchrosqueezed Morlet wavelet transform with the AOWW selection scheme applied

To clarify this issue, we study the electron dipole moment in atomic hydrogen induced by an optimally shaped laser waveform that can generate an isolated 21 attosecond pulse [15]. Such a laser profile can greatly extend the high-order harmonics up to the 900th harmonic within a short time interval, suggesting fast-varying IF components. The time-dependent dipole moment in the acceleration form is computed by solving a three-dimensional time-dependent Schrödinger equation in the framework of the time-dependent generalized pseudospectral (TDGPS) method within the electric dipole approximation [49]. The TDGPS method gives accurate orbital energies and has been employed in strong-field physics as well as attosecond science. Simulation details can be found in [15].

We then compute the Rényi entropy for a series of window widths, ranging from 0.25 atomic units (a.u.) to 8.33 a.u. We apply the TVOWW selection scheme with a neighborhood size of \(2b=10\) a.u., resulting in Fig. 8. Figure 8 indicates that three emissions take place. The cutoffs of the first and third emissions are located at around the 500th order, and the second emission reaches the 900th order. The branches on the TFR nearly coincide with the classical trajectories reported in the previous literature [15]. For comparison purposes, enlarged details of the second emission with the GOWW and the TVOWW are displayed in Fig. 9. It is observed that at around 0.43 laser cycles (1 laser cycle \(=275.77\) a.u.), the branch indicated by the blue arrow in Fig. 9b corresponding to the long trajectory quantum path has the strongest intensity and consists of the most harmonics. While the branch indicated by the red arrow dies out after 0.35 laser cycles in the TFR of the SST with the GOWW (Fig. 9a), the result with the TVOWW (Fig. 9b) reveals that the short trajectory quantum path also has an influence on the high-order harmonic emission. These high-order harmonics occur almost simultaneously, which is a prerequisite for a dependable attosecond pulse.

In the second example, we demonstrate that the AOWW selection scheme helps distinguish the near-threshold harmonics in the TF representation of a hydrogen atom in a strong laser field. Figure 10 shows the TF representations for HHG generated by a monochromatic laser field with a wavelength of 800 nm and an intensity of \(5\times 10^{13}\;\text{ W/cm }^2\). The laser field profile is described by \(\sin ^2 (\pi t/(nT))\), where \(n=40\) is the pulse length measured in optical cycles (\(T=2\pi /\omega _0\)), and \(\omega _0\) is the fundamental angular frequency of the laser. (The definition of the laser field profile and the simulation details can be found in [46, 47].) The laser parameters correspond to the Keldysh parameter \(\gamma _K=1.51\) [10, 47], indicating that the main dynamic mechanism is the multiphoton ionization process. Generally speaking, \(\gamma _K\gg 1\) and \(\gamma _K\ll 1\) correspond to the multiphoton ionization regime and the tunneling ionization regime, respectively. Figure 10a presents the result of the synchrosqueezed Morlet wavelet transform with a scaling parameter \(\tau =6\), and in Fig. 10b, the AOWW selection scheme is applied. Due to the advantage of multiresolution, the synchrosqueezed Morlet wavelet transform [35, 47] can clearly describe the below-threshold harmonics (from the 1st to the 5th harmonics), and the chirp-like dynamics in the above-threshold region. However, in the near-threshold region (the ionization threshold in this case is the 8.78th harmonic), the harmonics (i.e., from the 7th to the 11th harmonics) are coupled and ambiguous. After applying the AOWW selection scheme with a neighborhood size of \(2b=0.24 T\) and \(2b_F=0.46 \omega _0\), the near-threshold harmonics in Fig. 10b are clearly depicted in the TFR. The second example indicates that the AOWW selection scheme may be applied to other atomic systems such as the Cs atom [35].
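The quoted Keldysh parameter and ionization threshold can be checked with a short calculation in atomic units. The sketch below assumes the standard expression \(\gamma _K=\omega _0\sqrt{2I_p}/E_0\), the hydrogen ionization potential \(I_p=0.5\) a.u., and the usual conversion constants between laboratory and atomic units. The function names are hypothetical.

```python
import math

AU_INTENSITY_W_CM2 = 3.50944758e16   # intensity of a field with amplitude 1 a.u.
HARTREE_WAVELENGTH_NM = 45.5633526   # omega [a.u.] = 45.5633526 / wavelength [nm]

def photon_energy_au(wavelength_nm):
    """Fundamental angular frequency omega_0 in atomic units."""
    return HARTREE_WAVELENGTH_NM / wavelength_nm

def keldysh_parameter(ip_au, wavelength_nm, intensity_w_cm2):
    """gamma_K = omega_0 * sqrt(2 * I_p) / E_0, all quantities in atomic units."""
    e0 = math.sqrt(intensity_w_cm2 / AU_INTENSITY_W_CM2)
    return photon_energy_au(wavelength_nm) * math.sqrt(2.0 * ip_au) / e0

def sin2_envelope(t, n_cycles, period):
    """Pulse envelope sin^2(pi * t / (n * T)) used for the laser field profile."""
    return math.sin(math.pi * t / (n_cycles * period)) ** 2

ip_hydrogen = 0.5                                   # ionization potential, a.u.
omega0 = photon_energy_au(800.0)
gamma_k = keldysh_parameter(ip_hydrogen, 800.0, 5e13)
threshold_harmonic = ip_hydrogen / omega0           # ionization threshold in units of omega_0
period = 2 * math.pi / omega0                       # optical cycle T, a.u.
peak = sin2_envelope(20 * period, 40, period)       # envelope at the pulse center
```

With these inputs, `gamma_k` and `threshold_harmonic` round to the values 1.51 and 8.78 quoted in the text.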

4.3 A comparison with other methods

The proposed TVOWW and the AOWW selection schemes have similarities with some non-reassignment-type TF analysis methods. For example, in the sparsification approach [29], when the signal satisfies the regularity conditions of the AHM, a dictionary design and a sparsity-based optimization lead to the desired time-varying spectral information and signal decomposition. However, it is not clear how to achieve the optimal dictionary design, and the optimization step in the sparsification could be compute-intensive if the dictionary is chosen improperly. To have a parallel comparison with the reassignment-type transforms, note that the dictionary in the reassignment-type transforms, for example, \({\mathcal {D}}\) in (12), is infinitely redundant. The “optimal” frame is not chosen by any direct optimization procedure. Instead, the reassignment rule provides an approximation of the optimal frames. When combined with the TVOWW or the AOWW selection scheme, we get the optimal frame over an infinitely redundant dictionary. In this sense, when combined with the TVOWW or the AOWW, the reassignment-type transforms could be viewed as a variation of the sparsification approach.

The Tycoon [33], on the other hand, could be viewed as a TF analysis technique based on the convex optimization from the synthesis viewpoint [4]. In this approach, we do not design a dictionary or choose a window. Instead, we need to determine some fundamental quantities that a “good” TFR should satisfy and then directly find this good TFR by optimizing a functional capturing the determined fundamental quantities. Since the TFR determined by the SST could approximate the considered functional [22, 33], the combination of the TVOWW or the AOWW and the SST and its variations could be viewed as a relaxation of the Tycoon.

In the TFJP [32], we first fix a TF plane tiling. For each block in the TF plane tiling, the optimal window for the GT is then selected based on the Rényi entropy. While it leads to a sharper TFR, the “uncertainty” still exists. Furthermore, since the TF plane tiling is not uniformly distributed, the signal decomposition ability is limited. While the SST and its variations combined with the TVOWW or the AOWW selection schemes could be viewed as a variation of the TFJP in the sense of “window selection,” we mention that the TFJP and frame-based methods are different in essence. Specifically, the TFJP is designed not for sums of frequency-modulated signals but for more general signals, so its application fields are different.

5 Conclusions

In this study, we propose two optimal window width selection schemes, namely the TVOWW and the AOWW selection techniques, to optimize the concentration of the TFR determined by a chosen TF analysis. The Rényi entropy is used to measure the concentration of the TFR. In addition to showing the performance of the proposed schemes on a synthetic signal, we show potential applications of this method to attosecond physics. We believe that this work can serve as a cornerstone for uncovering new physics in the ultrafast dynamics of atoms and molecules.