1 Introduction

Continuous monitoring of physiological functions such as swallowing can pose severe constraints on data acquisition and processing systems. Even when sampling physiological signals at low rates (e.g., 250 Hz), we end up with close to a million samples (250 × 3,600 = 900,000) in the first hour of monitoring. Similar computational burdens are ever-present in telemedicine, and in recent years we have witnessed numerous efforts to deal with this problem. One such effort is to compress the acquired signals immediately upon sampling using various schemes (e.g., [1]). The other is to rethink the way we acquire the data, and a number of recent publications have begun looking at this approach (e.g., [2-5]).

The idea of compressive sensing (CS) has gained considerable attention in recent years. The main idea behind CS is to diminish the number of steps involved when acquiring data by combining sampling and compression into a single step [3, 4]. Specifically, CS enables one to acquire the data at sub-Nyquist rates, and recover it accurately from such sparse samples [3].

In this article, we propose an approach for CS of swallowing accelerometry signals based on a time-frequency dictionary. In particular, the members of the dictionary are the recently proposed modulated discrete prolate spheroidal sequences (MDPSS) [6, 7]. The bases within the time-frequency dictionary are obtained by modulation and variation of the bandwidth of discrete prolate spheroidal sequences (DPSS) to reflect the varying time-frequency nature of many biomedical signals, including the swallowing accelerometry signals considered in this article. Using the proposed approach, we carry out a numerical analysis of synthetic test signals and real swallowing accelerometry signals. The numerical analysis using the synthetic test signals showed that the CS approach based on MDPSS was more accurate than the CS approach based on DPSS (e.g., [7, 8]). Additionally, the analysis of swallowing accelerometry signals showed that we can obtain 90% cross-correlation between the reconstructed signals and the actual signals using only 50% of the samples. This has been observed for three different types of swallowing tasks.

The article is organized as follows: Section 2 describes swallowing accelerometry and outlines the advantages of this approach for detecting swallowing difficulties. In Section 3, we describe the proposed approach for CS using the time-frequency based dictionary consisting of MDPSS bases. Section 4 reports the data analysis steps that we carried out to obtain the reported results, which are presented in Section 5 along with the discussion of the same results. The conclusions are drawn in Section 6.

2 Swallowing accelerometry

Swallowing (deglutition) is a complex process of transporting food or liquid from the mouth to the stomach consisting of four phases: oral preparatory, oral, pharyngeal, and esophageal [9]. Dysphagic patients (i.e., patients suffering from swallowing difficulty) usually deviate from the well-defined pattern of healthy swallowing. Dysphagia frequently develops in stroke patients, head-injured patients, and patients with other paralyzing neurological diseases [10]. Patients with dysphagia are prone to choking and aspiration (the entry of material into the airway below the true vocal folds) [9]. Aspiration and dysphagia may lead to serious health sequelae including malnutrition and dehydration [11, 12], degradation in psychosocial well-being [13, 14], aspiration pneumonia [15], and even death [16].

The videofluoroscopic swallowing study (VFSS) is used widely in today's dysphagia management and it represents the gold standard for assessment [9, 17]. However, VFSS requires expensive X-ray equipment as well as expertise from speech-language pathologists and radiologists. Hence, only a limited number of institutions can offer VFSS and the procedure has been associated with long waiting lists [18, 19]. In addition, day-to-day monitoring of dysphagia is crucial because the severity of dysphagia can fluctuate over time, and VFSS is not suitable for such day-to-day monitoring.

Cervical auscultation is a promising non-invasive tool for the assessment of swallowing disorders [20] involving the examination of swallowing signals acquired via a stethoscope or other acoustic and/or vibration sensors during deglutition [21]. Swallowing accelerometry is one such approach and employs an accelerometer as a sensor during cervical auscultation. Swallowing accelerometry has been used to detect aspiration in several studies, which have described a shared pattern among healthy swallow signals, and verified that this pattern is either absent, delayed or aberrant in dysphagic swallow signals [22-34].

However, these previous studies used single-axis accelerometers and exclusively monitored vibrations propagated in the anterior-posterior direction at the cervical region. Proper hyolaryngeal movement with precise timing during bolus transit is vital for airway protection in swallowing [9]. Since the motion of the hyolaryngeal structure during swallowing occurs in both anterior-posterior (A-P) and superior-inferior (S-I) directions, the employment of dual-axis accelerometry seems well motivated. Since correlation has been reported between the extent of laryngeal elevation and the magnitude of the A-P swallowing accelerometry signal [35], it is hypothesized that vibrations in the S-I axis also capture useful information about laryngeal elevation. From a physiological standpoint, the S-I axis appears to be as worthy of investigation as the A-P axis because the maximum excursion of the hyolaryngeal structure during swallowing is of similar magnitude in both the anterior and superior directions [36, 37]. Recent contributions have indeed confirmed that dual-axis accelerometers yield more information and enhance analysis capabilities [38-43].

2.1 Data

Sample signals used in this article were collected from 408 participants (ages 18-65) over a 3-month period at a public science centre in Toronto, Ontario, Canada. All participants provided written consent and had no documented swallowing disorders. The research ethics boards of the Toronto Rehabilitation Institute and Holland Bloorview Kids Rehabilitation Hospital (both located in Toronto, Ontario, Canada) approved the study protocol.

To collect data from participants, we used a dual-axis accelerometer (ADXL322, Analog Devices), which was attached to the participant's neck (anterior to the cricoid cartilage) using double-sided tape. The axes of acceleration were aligned to the anterior-posterior and superior-inferior directions. Data were band-pass filtered in hardware with a pass band of 0.1-3,000 Hz and sampled at 10 kHz using a custom LabVIEW program running on a laptop computer. With the accelerometer attached, each participant was cued to perform five saliva swallows (i.e., dry swallows), five water swallows by cup with their chin perpendicular to the floor (i.e., wet swallows) and five water swallows in the chin-tucked position. The entire data collection session lasted 15 min per participant.

3 Proposed scheme

Traditional signal processing approaches for sensing and processing of information have relied on the Shannon sampling theorem, which states that a bandlimited signal $x(t)$ can be reconstructed from uniform samples $\{x(kT_s)\}$:

$$x(t) = \sum_{k} x(kT_s)\, \frac{\sin\left(\Omega_{\max}(t - kT_s)/\pi\right)}{\Omega_{\max}(t - kT_s)/\pi}$$
(1)

where $T_s$ is the sampling period and $\Omega_{\max}$ represents the maximum frequency present in the signal. In other words, the Shannon sampling theorem states that in order to ensure accurate representation and reconstruction of a signal with maximum frequency $\Omega_{\max}$, we should sample it at a rate of at least $2\Omega_{\max}$ samples per second (i.e., the Nyquist rate). However, many recent publications have challenged this approach for a number of reasons (e.g., [44, 45]). First, by using the Shannon sampling theorem we rely on bases of infinite support, while we generally reconstruct signal samples in the finite domain [44]. Second, large bandwidth values can severely constrain sampling architectures [45]. Third, even when we consider signals with relatively low bandwidths, such as swallowing accelerometry signals, continuous monitoring of swallowing function can produce a large number of redundant samples, which severely constrains our processing efforts.

A recently proposed idea of CS resolves some of the aforementioned issues [3-5]. CS is a method closely related to transform coding, since a transform code converts input signals, embedded in a high-dimensional space, into signals that lie in a space of significantly smaller dimensions (e.g., wavelet and Fourier transforms) [4]. CS approaches are particularly suited for K-sparse signals, i.e., signals that can be represented by K significant coefficients over an N-dimensional basis. Encoding of a K-sparse, discrete-time signal x of dimension N is accomplished by computing a measurement vector y that consists of M << N linear projections of the vector x. This can be compactly described via

y = Φ x
(2)

where Φ represents an M × N matrix and is often referred to as the sensing matrix [4]. A natural formulation of the recovery problem is within an $\ell_0$-norm minimization framework, which seeks a solution to the problem

$$\min \|x\|_0 \quad \text{subject to} \quad \|y - \Phi x\|_2 < \eta$$
(3)

where η is the expected noise of the measurements, $\|x\|_0$ counts the number of nonzero entries of x, and $\|\cdot\|_2$ is the Euclidean norm. Unfortunately, the above minimization is not suitable for many applications as it is NP-hard [46]. To avoid the computational burden, approaches such as thresholding, (orthogonal) matching pursuit and basis pursuit have been proposed [46]. In this article, we will focus on the matching pursuit [47].

Given the CS framework, the immediate question is how to define the sensing matrix Φ, that is, the bases used in the recovery of the signal. The most commonly used sensing matrices are random matrices with independent identically distributed (i.i.d.) entries formed by sampling either a Gaussian distribution or a symmetric Bernoulli distribution [48]. Previous publications have shown that these matrices can recover the signal with high probability [48]. However, when dealing with biomedical signals, we would like to "precisely" recover the signals (i.e., with a very small error). Therefore, we propose to use a time-frequency dictionary (also known as frames [49]) based on modulated discrete prolate spheroidal sequences (MDPSS).
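To make Equation (2) concrete, the following is a minimal sketch of measuring a K-sparse vector with an i.i.d. Gaussian sensing matrix. It is an illustration only and not the acquisition scheme used in this article; the sizes `N`, `M`, `K`, the fixed seed and the `1/sqrt(M)` scaling are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed, assumed for reproducibility
N, M, K = 256, 64, 5                # illustrative sizes with M << N

# K-sparse signal: K nonzero entries at random positions
x = np.zeros(N)
support = rng.choice(N, size=K, replace=False)
x[support] = rng.standard_normal(K)

# i.i.d. Gaussian sensing matrix; the 1/sqrt(M) scaling is a common convention
Phi = rng.standard_normal((M, N)) / np.sqrt(M)

# Equation (2): M linear projections of x
y = Phi @ x
```

Only the K columns of `Phi` indexed by `support` contribute to `y`, which is why recovering x amounts to identifying that support.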

3.1 Time-frequency dictionaries based on MDPSS

To understand MDPSS, we begin with a general description of discrete prolate spheroidal sequences (DPSS). Given a sequence length N, with $n = 0, 1, \ldots, N-1$, and a normalized half-bandwidth W such that $0 < W < 0.5$, the kth DPSS, $v_k(n, N, W)$, is defined as the real solution to the system of equations [50]:

$$\sum_{m=0}^{N-1} \frac{\sin[2\pi W(n - m)]}{\pi(n - m)}\, v_k(m, N, W) = \lambda_k(N, W)\, v_k(n, N, W), \quad k = 0, 1, \ldots, N-1$$
(4)

with $\lambda_k(N, W)$ being the ordered non-zero eigenvalues of (4)

$$\lambda_0(N, W) > \lambda_1(N, W), \ldots, \lambda_{N-1}(N, W) > 0.$$
(5)

Slepian showed that the behaviour of these eigenvalues for fixed k and large N is given by

$$1 - \lambda_k(N, W) \sim \frac{\sqrt{\pi}}{k!}\, 2^{\frac{14k+9}{4}}\, \alpha^{\frac{2k+1}{4}}\, [2 - \alpha]^{-(k+0.5)}\, N^{k+0.5}\, e^{-\gamma N}$$
(6)

where

$$\alpha = 1 - \cos(2\pi W), \qquad \gamma = \log\left[1 + \frac{2\sqrt{\alpha}}{\sqrt{2} - \sqrt{\alpha}}\right]$$

The first 2NW eigenvalues are very close to 1, while the rest rapidly decay to zero [50]. Interestingly enough, it has been observed that these quantities are also the eigenvalues of the N × N matrix C(m, n) [50], where the elements of such a matrix are

$$C(m, n) = \frac{\sin[2\pi W(n - m)]}{\pi(n - m)}, \quad m, n = 0, 1, \ldots, N-1$$
(7)

and the vector obtained by time-limiting the DPSS, $v_k(n, N, W)$, is an eigenvector of C(m, n). The DPSS are doubly orthogonal, that is, they are orthogonal on the infinite set $\{-\infty, \ldots, \infty\}$ and orthonormal on the finite set $\{0, 1, \ldots, N-1\}$, that is,

$$\sum_{n=-\infty}^{\infty} v_i(n, N, W)\, v_j(n, N, W) = \lambda_i \delta_{ij}$$
(8)
$$\sum_{n=0}^{N-1} v_i(n, N, W)\, v_j(n, N, W) = \delta_{ij}$$
(9)

where i, j = 0, 1, . . . , N − 1. The sequences also obey symmetry laws

$$v_k(n, N, W) = (-1)^k\, v_k(N - 1 - n, N, W)$$
(10)
$$v_k(n, N, W) = (-1)^k\, v_{N-1-k}(N - 1 - n, N, 1/2 - W)$$
(11)

where n = 0, ± 1, ± 2, . . . and k = 0, 1, . . . , N − 1.
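The properties above can be checked numerically. The sketch below is one possible way to do so, assuming the DPSS are obtained by a direct eigen-decomposition of the matrix in Equation (7); the helper name `dpss_eig` and the values of `N` and `W` are illustrative and not taken from the article. For large NW the leading eigenvalues become numerically indistinguishable, so the example keeps NW small.

```python
import numpy as np

def dpss_eig(N, W):
    """Eigenvalues (descending) and DPSS (columns of V) from Equation (7)."""
    n = np.arange(N)
    d = n[:, None] - n[None, :]
    # sin(2*pi*W*d) / (pi*d); np.sinc handles the d = 0 limit, which equals 2W
    C = 2 * W * np.sinc(2 * W * d)
    lam, V = np.linalg.eigh(C)          # eigh returns ascending eigenvalues
    return lam[::-1], V[:, ::-1]

N, W = 64, 0.05                          # illustrative values, NW = 3.2
lam, V = dpss_eig(N, W)

n_large = int(np.sum(lam > 0.5))         # roughly 2NW eigenvalues are close to 1
gram = V.T @ V                           # Equation (9): Gram matrix on {0,...,N-1}
# Equation (10): v_k(n) = (-1)^k v_k(N-1-n), checked for the leading sequences
symmetry_ok = all(
    np.allclose(V[:, k], (-1) ** k * V[::-1, k], atol=1e-6) for k in range(4)
)
```

With these values, `n_large` should be close to 2NW = 6.4 and `gram` should be numerically the identity matrix.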

If these DPSS are used for signal representation, then usually accurate and sparse representations are obtained when both the DPSS and the signal under investigation occupy the same band (e.g., [6, 51]). However, problems arise when the signal is centered around some frequency $|\omega_o| > 0$ and occupies a bandwidth smaller than 2W. In such situations, a larger number of DPSS is required to approximate the signal with the same accuracy, despite the fact that narrowband signals are more predictable than wider band signals [7, 52]. In order to find a better basis, MDPSS were proposed in [6, 7]. MDPSS are defined as

$$M_k(N, W, \omega_m; n) = \exp(j\omega_m n)\, v_k(N, W; n)$$
(12)

where $\omega_m = 2\pi f_m$ is a modulating frequency. It is easy to see that MDPSS are also doubly orthogonal, obey the same Equation (4) and are bandlimited to the frequency band $[-W + \omega_m : W + \omega_m]$.
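As a hedged illustration of Equation (12), the sketch below modulates the first few DPSS and checks that the spectrum of the zeroth-order sequence moves from baseband to the modulating frequency. The helper names `dpss_eig` and `mdpss`, the parameter values, and the use of a normalized frequency `f_m` in cycles per sample are assumptions made for this example.

```python
import numpy as np

def dpss_eig(N, W, K):
    """First K DPSS (columns) of length N and half-bandwidth W via Equation (7)."""
    n = np.arange(N)
    C = 2 * W * np.sinc(2 * W * (n[:, None] - n[None, :]))
    _, V = np.linalg.eigh(C)
    return V[:, ::-1][:, :K]

def mdpss(N, W, f_m, K):
    """Equation (12): multiply each DPSS by exp(j*omega_m*n), omega_m = 2*pi*f_m."""
    n = np.arange(N)
    return np.exp(2j * np.pi * f_m * n)[:, None] * dpss_eig(N, W, K)

N, W, f_m, K = 64, 0.05, 0.2, 4          # illustrative values
Mk = mdpss(N, W, f_m, K)

# Spectrum of the zeroth-order MDPSS on a dense frequency grid
nfft = 4096
freqs = np.fft.fftfreq(nfft)             # cycles per sample, in [-0.5, 0.5)
power = np.abs(np.fft.fft(Mk[:, 0], nfft)) ** 2
f_peak = freqs[np.argmax(power)]
in_band = np.abs(freqs - f_m) <= W
band_fraction = power[in_band].sum() / power.sum()
```

Here `f_peak` lands at the modulating frequency and `band_fraction` shows that nearly all of the energy lies in the shifted band, while the modulated sequences remain orthonormal on the finite index set.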

The next question which needs to be answered is how to choose a proper modulation frequency $\omega_m$. In the simplest case, when the spectrum $S(\omega)$ of the signal is confined to a known band $[\omega_1; \omega_2]$, i.e.,

$$S(\omega) = \begin{cases} \neq 0, & \omega \in [\omega_1, \omega_2] \text{ and } |\omega_1| < |\omega_2| \\ 0, & \text{elsewhere} \end{cases}$$
(13)

then the modulating frequency, $\omega_m$, and the bandwidth of the DPSS are naturally defined by

$$\omega_m = \frac{\omega_1 + \omega_2}{2}$$
(14)
$$W = \frac{\omega_2 - \omega_1}{2}$$
(15)

as long as both satisfy:

$$\omega_m + W < \frac{1}{2}.$$
(16)

However, in practical applications, the exact frequency band is known only with a certain degree of accuracy and usually evolves in time. Therefore, only some relatively wide frequency band is expected to be known. In such situations, an approach based on one-band-fits-all may not produce a sparse and accurate approximation of the signal. In order to resolve this problem, it was suggested to use a band of bases with different widths to account for time-varying bandwidths [53]. However, such a representation once again ignores the fact that the actual signal bandwidth could be much less than the 2W dictated by the bandwidth of the DPSS. In order to provide further robustness to the estimation problem, we suggest the use of a time-frequency dictionary containing bases which reflect various bandwidth scenarios.

To construct this time-frequency dictionary, it is assumed that an estimate of the maximum frequency is available. The first few bases in the dictionary are the actual traditional DPSS with bandwidth W. Additional bases could be constructed by partitioning the band $[-\omega; \omega]$ into K subbands with the boundaries of each subband given by $[\omega_k; \omega_{k+1}]$, where $0 \le k \le K-1$, $\omega_{k+1} > \omega_k$, $\omega_0 = -\omega$, and $\omega_K = \omega$. Hence, each set of MDPSS has a bandwidth equal to $\omega_{k+1} - \omega_k$ and a modulation frequency equal to $\omega_m = 0.5(\omega_k + \omega_{k+1})$.

Obviously, a set of such functions again forms a basis of functions limited to the bandwidth $[-\omega; \omega]$. The particular partition is arbitrary for every level K ≥ 1, and we can choose to partition the bandwidth in any desired way, as shown in Figure 1. In this article, we partition the bandwidth into equal blocks, as shown in Figure 1d, to reduce the amount of stored pre-computed DPSS. In general, finding the best partitioning approach would be based on a priori knowledge about the phenomenon under investigation. Unless such knowledge is available, there is no strong reason for us to believe that the non-uniform approaches shown in Figures 1a-c would yield a better performance than the uniform partitioning scheme shown in Figure 1d without extensive optimization procedures. However, such investigations are beyond the scope of this manuscript.
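A uniform partition of this kind can be sketched as follows. This is a simplified illustration under stated assumptions, not the exact construction used for the reported results: frequencies are expressed as normalized values in cycles per sample, the band is taken to be $[-W, W]$, and the number of sequences kept per band (here $\lceil 2NW \rceil + 1$ for each half-bandwidth) as well as the names `dpss_eig` and `mdpss_dictionary` are hypothetical choices.

```python
import numpy as np

def dpss_eig(N, W, K):
    """First K DPSS (columns) of length N and half-bandwidth W via Equation (7)."""
    n = np.arange(N)
    C = 2 * W * np.sinc(2 * W * (n[:, None] - n[None, :]))
    _, V = np.linalg.eigh(C)
    return V[:, ::-1][:, :K]

def mdpss_dictionary(N, W, n_bands):
    """Baseband DPSS of half-bandwidth W plus MDPSS for n_bands equal subbands."""
    n = np.arange(N)
    n_base = int(np.ceil(2 * N * W)) + 1          # assumed number of baseband atoms
    atoms = [dpss_eig(N, W, n_base).astype(complex)]

    edges = np.linspace(-W, W, n_bands + 1)       # subband boundaries
    W_sub = W / n_bands                            # half-bandwidth of every subband
    n_sub = int(np.ceil(2 * N * W_sub)) + 1       # assumed number of atoms per subband
    base = dpss_eig(N, W_sub, n_sub)              # computed once, reused for all subbands
    for lo, hi in zip(edges[:-1], edges[1:]):
        f_c = 0.5 * (lo + hi)                      # modulation frequency of the subband
        atoms.append(np.exp(2j * np.pi * f_c * n)[:, None] * base)
    return np.hstack(atoms)

D = mdpss_dictionary(N=128, W=0.2, n_bands=4)
```

Because all subbands share the same width, a single narrowband DPSS set (`base`) is modulated to every subband centre, which reflects the storage argument for equal blocks made above. For large NW, a dedicated routine such as `scipy.signal.windows.dpss` is likely to be numerically preferable to the direct eigen-decomposition used here.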

Figure 1

Different approaches to form a 4-band time-frequency dictionary based on MDPSS. (a)-(c) demonstrate non-uniform blocks, while (d) represents a uniform-block division.

3.2 Matching pursuit and MDPSS-based frames

As mentioned at the beginning of Section 3, the CS recovery problem in Equation (3) is NP-hard, which makes a direct solution practically infeasible. Fortunately, efficient algorithms, known generically as matching pursuit [47, 49], can be used to avoid some of the computational burden associated with CS. The main feature of the algorithm is that when stopped after a few steps, it yields an approximation using only a few basis functions [47]. The matching pursuit decomposes any signal into a linear expansion of waveforms that are selected from a redundant dictionary of functions [47]. It is a general, greedy, sparse function approximation scheme with the squared error loss, which iteratively adds new functions (i.e., basis functions) to the linear expansion. In comparison to a basis pursuit, it significantly reduces the computational complexity, since the basis pursuit minimizes a global cost function over all bases present in the dictionary [47]. If the dictionary is orthogonal, the method works perfectly. Also, to achieve a compact representation of the signal, it is necessary that the atoms are representative of the signal behaviour and that the appropriate atoms from the dictionary are chosen.

The algorithm for the matching pursuit starts with an initial approximation for the signal, $\hat{x}$, and the residual, R:

$$\hat{x}^{(0)}(m) = 0$$
(17)
$$R^{(0)}(m) = x(m)$$
(18)

where m represents the M time indices that are uniformly or non-uniformly distributed. Then, the matching pursuit builds up a sequence of sparse approximations by reducing the norm of the residual, $R = x - \hat{x}$. At stage k, it identifies the dictionary atom that best correlates with the residual and then adds to the current approximation a scalar multiple of that atom, such that

$$\hat{x}^{(k)}(m) = \hat{x}^{(k-1)}(m) + \alpha_k \phi_k(m)$$
(19)
$$R^{(k)}(m) = x(m) - \hat{x}^{(k)}(m)$$
(20)

where $\alpha_k = \langle R^{(k-1)}(m), \phi_k(m) \rangle / \|\phi_k(m)\|^2$. The process continues until the norm of the residual $R^{(k)}(m)$ does not exceed the required margin of error ε > 0: $\|R^{(k)}(m)\| \le \varepsilon$ [47].

Here we can consider two stopping approaches. One is based on the idea that the normalized mean square error should be below a certain threshold value, γ:

$$\frac{\|x - \hat{x}^{(k)}\|_2^2}{\|x\|_2^2} \le \gamma$$
(21)

An alternative stopping rule can mandate that the number of bases, $n_B$, needed for signal approximation should satisfy $n_B \le K$. In previous contributions (e.g., [6]), K is set equal to $\lceil 2NW \rceil + 1$ to compare the performance of the MDPSS-based frames with DPSS.

In either case, a matching pursuit approximates the signal using L bases as

$$x(n) = \sum_{l=1}^{L} \langle x(m), \phi_l(m) \rangle\, \phi_l(n) + R^{(L)}(n)$$
(22)

where $\phi_l$ are the L bases from the dictionary with the strongest contributions.
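The iteration in Equations (17)-(20), together with the two stopping rules, can be sketched in a few lines. The function below is a generic matching pursuit written for illustration; the name `matching_pursuit`, its arguments and the small orthonormal test dictionary are assumptions and do not reproduce the implementation used for the reported results.

```python
import numpy as np

def matching_pursuit(x, D, max_iter, nmse_tol=0.0):
    """Greedy matching pursuit.

    x: (M,) available samples; D: (M, P) dictionary, one atom per column,
    evaluated at the same M sample positions. Stops after max_iter iterations
    or once the normalized squared error drops to nmse_tol (Equation (21)).
    """
    x = np.asarray(x, dtype=complex)
    D = np.asarray(D, dtype=complex)
    norms_sq = np.sum(np.abs(D) ** 2, axis=0)
    coeffs = np.zeros(D.shape[1], dtype=complex)
    residual = x.copy()                              # Equation (18)
    energy = np.vdot(x, x).real
    for _ in range(max_iter):
        if energy == 0 or np.vdot(residual, residual).real / energy <= nmse_tol:
            break
        corr = D.conj().T @ residual                 # <R, phi_p> for every atom
        p = np.argmax(np.abs(corr) / np.sqrt(norms_sq))
        alpha = corr[p] / norms_sq[p]                # alpha_k as defined after Equation (20)
        coeffs[p] += alpha                           # Equation (19)
        residual = residual - alpha * D[:, p]        # Equation (20)
    return coeffs, residual

# Illustrative check with an orthonormal dictionary and a 2-atom signal
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((32, 32)))
x_test = 2.0 * Q[:, 3] - 1.5 * Q[:, 7]
c, r = matching_pursuit(x_test, Q, max_iter=2)
```

In the CS setting, `D` would hold the dictionary atoms restricted to the available sample positions m, and the full-length estimate would then be formed by multiplying the full-length atoms by the returned coefficients.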

3.3 Estimation of sampling times

Based on the definition of MDPSS, we are expected to know when the sampling times occur in order to use a proper value of the basis function. However, this assumption is typically not realized and we need to estimate the time location. Therefore, let us assume that the signal

$$x(t) = \sum_{m=0}^{M-1} x(\hat{t}_m)\, \delta(t - \hat{t}_m) + n(t)$$
(23)

is a superposition of M delta functions with additive noise n(t) resulting from the nonuniform sampling. To estimate $\hat{t}_m$, let us first consider the periodic extension of the signal:

$$x(t) = \sum_{k=-\infty}^{\infty} X_k\, e^{jk\Omega_o t} + n(t)$$
(24)

where $\Omega_o = 2\pi/T$ and the Fourier coefficients are given by

$$X_k = \sum_{m=0}^{M-1} x(\hat{t}_m)\, e^{-jk\Omega_o \hat{t}_m} = \sum_{m=0}^{M-1} x(\hat{t}_m)\, u_m^{k}, \quad -(M-1) \le k \le (M-1)$$
(25)

where $u_m = e^{-j\Omega_o \hat{t}_m}$. The problem is then to find the parameters $\hat{t}_m$ that satisfy the above equation from the noisy nonuniform samples, which can be achieved using the annihilating filter [2, 44, 54]. In particular, if the transfer function of the annihilating filter is defined as

$$A(z) = \prod_{m=0}^{M-1} \left(1 - u_m z^{-1}\right) = \sum_{m=0}^{M-1} \alpha_m z^{-m}$$
(26)

then by filtering both sides of Equation (25) using the filter, we get

$$\sum_{m=0}^{M-1} \alpha_m X_{k-m} = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} x(\hat{t}_n)\, u_n^{k-m}\, \alpha_m = \sum_{n=0}^{N-1} x(\hat{t}_n) \left[\sum_{m=0}^{M-1} \alpha_m u_n^{-m}\right] u_n^{k}$$
(27)

where the last term vanishes because each $u_n$ is a root of $A(z)$. Then, $A(z)$ can be obtained by solving Equation (27) for $\{\alpha_m\}$ (i.e., set Equation (27) equal to zero and solve for the filter coefficients). Using the roots of $A(z)$, $u_m = e^{-j\Omega_o \hat{t}_m}$, the nonuniform sampling times are estimated by

$$\hat{t}_m = -\frac{T}{2\pi j} \log u_m, \quad m = 0, \ldots, M-1$$
(28)

A thorough description of the procedure can be found in [2, Appendices 1 and 2].
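For intuition, the annihilating-filter idea can be illustrated on a small noise-free example. The sketch below is an assumption-laden simplification rather than the procedure of [2]: it uses the common formulation with M + 1 filter taps and the 2M + 1 coefficients $X_{-M}, \ldots, X_{M}$, obtains the taps as the null vector of a small Toeplitz system via the SVD, and ignores noise entirely. The function name `estimate_sampling_times` and all numerical values are hypothetical.

```python
import numpy as np

def estimate_sampling_times(X, M, T):
    """Estimate M time locations in [0, T) from X_k, k = -M..M (X[k + M] holds X_k)."""
    # Rows impose sum_l alpha_l * X_{k-l} = 0 for k = 0..M, with l = 0..M
    A = np.array([[X[k - l + M] for l in range(M + 1)] for k in range(M + 1)])
    # The filter taps span the null space of A: take the last right singular vector
    _, _, Vh = np.linalg.svd(A)
    alpha = Vh[-1].conj()
    # Roots of alpha_0 z^M + ... + alpha_M are the u_m = exp(-j*Omega_o*t_m)
    u = np.roots(alpha)
    # Equation (28): for roots on the unit circle, log(u) = j * angle(u)
    t_hat = (-T / (2 * np.pi) * np.angle(u)) % T
    return np.sort(t_hat)

# Illustrative noise-free example
T = 1.0
t_true = np.array([0.10, 0.35, 0.80])
amp = np.array([1.0, 2.0, 0.5])
M = len(t_true)
k = np.arange(-M, M + 1)
X = (amp[None, :] * np.exp(-2j * np.pi * k[:, None] * t_true[None, :] / T)).sum(axis=1)
t_est = estimate_sampling_times(X, M, T)
```

With noisy coefficients the null vector is only approximate, which is why more robust variants of the procedure are described in the cited references.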

4 Data analysis

Our data analysis consists of two parts. In the first part, we consider the synthetic test signals in order to examine the accuracy of the scheme in well-known conditions. In the second part, we use dual-axis swallowing accelerometry signals to examine how accurately we can recover these signals from sparse samples. In both cases, we will follow the procedure shown in Figure 2.

Figure 2

A flow chart for the proposed algorithm.

4.1 Synthetic test signals

To analyze the proposed scheme, we assumed the following test signal:

$$x(n) = \sum_{i=1}^{10} A_i \sin(2\pi f_i n T_s) + \sigma \zeta(n)$$
(29)

where $0 \le n < N$, $T_s = 1/256$, $N = 256$, $A_i$ is drawn uniformly from $[0, 2]$ and $f_i \sim \mathcal{N}(30, 10^2)$. $\zeta(n)$ represents white Gaussian noise and σ is its standard deviation.
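One realization of Equation (29) can be generated as sketched below. The article does not spell out how σ is tied to the SNR, so the sketch assumes that the SNR is the ratio of the mean power of the noise-free signal to the noise variance; the function name `synthetic_signal` and the seed are likewise illustrative.

```python
import numpy as np

def synthetic_signal(snr_db, N=256, fs=256.0, n_tones=10, rng=None):
    """Return (noisy, clean, sigma) for one realization of Equation (29)."""
    rng = np.random.default_rng() if rng is None else rng
    n = np.arange(N)
    A = rng.uniform(0.0, 2.0, n_tones)           # A_i drawn uniformly from [0, 2]
    f = rng.normal(30.0, 10.0, n_tones)          # f_i ~ N(30, 10^2), in Hz
    clean = (A[:, None] * np.sin(2 * np.pi * f[:, None] * n[None, :] / fs)).sum(axis=0)
    # Assumed SNR definition: mean signal power divided by noise variance
    sigma = np.sqrt(np.mean(clean ** 2) / 10 ** (snr_db / 10))
    noisy = clean + sigma * rng.standard_normal(N)
    return noisy, clean, sigma

x, x_clean, sigma = synthetic_signal(20.0, rng=np.random.default_rng(0))
snr_emp = 10 * np.log10(np.mean(x_clean ** 2) / np.mean((x - x_clean) ** 2))
```

Repeating the call with fresh random draws gives the independent realizations over which an average error can be computed.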

The first experiment consists of maintaining 150 samples equally spaced throughout the signal. The SNR values are varied between 0 and 30 dB in 1-dB increments, while the normalized half-bandwidth W is altered between 0.300 and 0.375 in 0.025 increments. We compared the accuracy of the proposed approach using 7- and 15-band MDPSS-based dictionaries against the CS approach based on DPSS. The accuracy was compared by evaluating the normalized mean square error:

$$\text{MSE} = \frac{\|x(n) - \hat{x}(n)\|_2^2}{\|x(n)\|_2^2}$$
(30)

where x(n) is a realization of the signal defined by Equation (29) and $\hat{x}(n)$ represents a recovered signal. The MSE values were obtained using 1,000 realizations. To calculate the recovered signal using the DPSS, we used the following formula

$$\hat{x}_{\text{DPSS}}(n) = U(n, k) \left[ U(m, k)^T U(m, k) \right]^{\dagger} U(m, k)^T x(m)$$
(31)

where $A^{\dagger}$ denotes the pseudo-inverse of a matrix A; U(n, k) is the matrix containing K bases (i.e., DPSS), each of length N; and m denotes the time instances at which the samples are available.
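Equation (31) is an ordinary least-squares fit of the DPSS coefficients to the available samples, followed by synthesis on the full grid. The sketch below illustrates this under assumed values of `N`, `W` and `K` and with a test signal that lies exactly in the span of the DPSS; the helper names `dpss_eig` and `dpss_recover` are hypothetical.

```python
import numpy as np

def dpss_eig(N, W, K):
    """First K DPSS (columns) of length N and half-bandwidth W via Equation (7)."""
    n = np.arange(N)
    C = 2 * W * np.sinc(2 * W * (n[:, None] - n[None, :]))
    _, V = np.linalg.eigh(C)
    return V[:, ::-1][:, :K]

def dpss_recover(x_m, m, N, W, K):
    """Equation (31): least-squares DPSS fit to samples x_m taken at indices m."""
    U = dpss_eig(N, W, K)                    # U(n, k), shape (N, K)
    U_m = U[m, :]                            # U(m, k): rows at the available times
    coeffs = np.linalg.pinv(U_m.T @ U_m) @ U_m.T @ x_m
    return U @ coeffs                        # estimate at all n = 0..N-1

N, W, K = 64, 0.05, 5                        # illustrative values
rng = np.random.default_rng(0)
x = dpss_eig(N, W, K) @ rng.standard_normal(K)   # test signal inside the DPSS span
m = np.arange(0, N, 2)                       # keep every other sample
x_rec = dpss_recover(x[m], m, N, W, K)
```

Because the test signal lies in the span of the K sequences, half of the samples suffice for an essentially exact recovery; signals with components outside that span incur an approximation error.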

In the second experiment, we vary the number of available samples from 50 samples to 200 samples in increments of 10 samples in order to understand how the number of samples affects the overall accuracy of the proposed scheme. The samples are uniformly distributed, and the normalized half-bandwidth is set to 0.30. The lower boundary of 50 samples denotes a very aggressive scheme, as it represents approximately 20% of the original samples. On the other hand, the upper boundary of 200 samples represents a very lenient scheme for compressive sampling since it represents approximately 78% of the original samples. Additionally, we use the following four SNR values: 5, 15, 25 and 35 dB. The accuracy of the proposed CS approach is examined using 7- and 15-band MDPSS-based dictionaries against the CS approach based on DPSS. The accuracy metric is the MSE value defined by Equation (30) and 1,000 realizations are used to obtain its values.

The third experiment examines the effects of non-uniform sampling times on the overall performance of the CS-based schemes. In particular, we use 100 non-uniform samples and the SNR values are incremented by 1 dB from 0 to 30 dB. Also, the normalized half-bandwidth is varied in 0.025 increments from 0.30 to 0.375. The accuracy of the proposed approach based on MDPSS is compared against the CS approach based on DPSS. Specifically, we use 7- and 15-band MDPSS-based time-frequency dictionaries. The accuracy metric is again the MSE value defined by Equation (30). Again, 1,000 realizations are used to obtain the MSE values, and for each realization 100 new time positions are drawn.

4.2 Swallowing accelerometry signals

Using the proposed scheme, we analyze how accurately we can recover dual-axis swallowing accelerometry signals from sparse samples. Specifically, we assume two different scenarios: only 30% of the original samples are available and only 50% of the original samples are available. In both cases, we examine whether the uniform or non-uniform sub-Nyquist rates have significant effects on the overall effectiveness of the proposed scheme. In this numerical experiment, we use a 10-band MDPSS-based dictionary with the normalized half-bandwidth equal to 0.15. To evaluate the effectiveness of the proposed approach when considering dual-axis swallowing accelerometry signals, we adopted performance metrics used in other biomedical applications (e.g., [5, 55, 56]). Those metrics are:

  • Cross-correlation (CC): CC is used to evaluate the similarity between the original and the reconstructed signal, and is defined as:

    $$\text{CC} = \frac{\sum_{n=1}^{N} (x(n) - \mu_x)(\hat{x}(n) - \mu_{\hat{x}})}{\sqrt{\sum_{n=1}^{N} (x(n) - \mu_x)^2 \sum_{n=1}^{N} (\hat{x}(n) - \mu_{\hat{x}})^2}} \times 100\%$$
    (32)

    where x(n) is the original signal and $\hat{x}(n)$ represents a reconstructed signal. In addition, $\mu_x$ and $\mu_{\hat{x}}$ denote the mean values of x(n) and $\hat{x}(n)$, respectively.

  • Percent root difference (PRD): PRD measures distortion in reconstructed biomedical signals, and is defined as:

    $$\text{PRD}(\%) = \sqrt{\frac{\sum_{n=1}^{N} (x(n) - \hat{x}(n))^2}{\sum_{n=1}^{N} x^2(n)}} \times 100\%$$
    (33)
  • Root mean square error (RMSE): RMSE also measures distortion, and it is often beneficial to minimize this metric when finding the optimal approximation of the signal. RMSE is defined as:

    $$\text{RMSE} = \sqrt{\frac{\sum_{n=1}^{N} (x(n) - \hat{x}(n))^2}{N}}$$
    (34)
  • Maximum error (MAXERR): MAXERR is used to understand the local distortions in the reconstructed signal, and it particularly denotes the largest error between the samples of the original signal and the reconstructed signal. The metric is defined as:

    $$\text{MAXERR} = \max_n \left| x(n) - \hat{x}(n) \right|$$
    (35)
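The four metrics in Equations (32)-(35) translate directly into code. The sketch below is a straightforward transcription for one-dimensional arrays; the function names and the small numerical example are illustrative, and MAXERR is taken as the largest absolute sample error.

```python
import numpy as np

def cc(x, x_hat):
    """Equation (32): cross-correlation in percent."""
    dx, dxh = x - x.mean(), x_hat - x_hat.mean()
    return 100.0 * np.sum(dx * dxh) / np.sqrt(np.sum(dx ** 2) * np.sum(dxh ** 2))

def prd(x, x_hat):
    """Equation (33): percent root difference."""
    return 100.0 * np.sqrt(np.sum((x - x_hat) ** 2) / np.sum(x ** 2))

def rmse(x, x_hat):
    """Equation (34): root mean square error."""
    return np.sqrt(np.mean((x - x_hat) ** 2))

def maxerr(x, x_hat):
    """Equation (35): largest absolute sample error (assumed reading)."""
    return np.max(np.abs(x - x_hat))

# Illustrative values only
x_ref = np.array([1.0, 2.0, 3.0, 4.0])
x_rec = x_ref + np.array([0.1, -0.1, 0.1, -0.1])
metrics = {
    "CC": cc(x_ref, x_rec),
    "PRD": prd(x_ref, x_rec),
    "RMSE": rmse(x_ref, x_rec),
    "MAXERR": maxerr(x_ref, x_rec),
}
```

CC is insensitive to a constant offset or a positive rescaling of the reconstruction, whereas PRD, RMSE and MAXERR are not, which is one reason to report them together.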

In order to establish the statistical significance of our results, a non-parametric inferential statistical method known as the Mann-Whitney test was used [57], which assesses whether observed samples are drawn from a single population (i.e., the null hypothesis). For multi-group testing, the extension of the Mann-Whitney test known as the Kruskal-Wallis test was used [58]. A 5% significance level was used.

5 Results and discussion

In this section, we present the results of numerical experiments and discuss those results. First, we will discuss the results based on the synthetic test signals. In the second part, we will discuss the results of numerical experiments considering the application of the proposed approach to dual-axis swallowing accelerometry signals.

5.1 Synthetic test signals

The results of the first numerical experiment are shown in Figure 3. Several observations are in order. First, the proposed approach for CS based on the time-frequency dictionary containing MDPSS achieved more accurate signal reconstructions than the CS approach based on DPSS. This can be observed regardless of the initial bandwidth used for discrete prolate sequences. Second, the CS approaches based on both MDPSS and DPSS bases provide similar accuracy at very low SNR values (e.g., SNR < 5 dB), which is consistent with previous publications showing that the accuracy of CS diminishes as SNR decreases [59].

Figure 3

The effects of increasing the initial bandwidth of discrete prolate sequences: (a) W = 0.300; (b) W = 0.325; (c) W = 0.350; (d) W = 0.375. The dashed line denotes the MSE obtained with the DPSS; the solid line indicates the MSE obtained with a 15-band MDPSS-based dictionary; and the solid line with squares denotes a 7-band MDPSS-based dictionary.

The results of the second simulation are shown in Figure 4. As expected, CS approaches based on MDPSS and DPSS have similar accuracies for a low SNR value (i.e., SNR = 5 dB), as shown in Figure 4a. Neither type of basis (i.e., MDPSS or DPSS) is suitable for accurate representations of random variables, and dictionaries based on random bases would possibly be a more suitable approach for low SNR values. As SNR increases, the MSE decreases for both approaches and the CS approach based on MDPSS obtains higher accuracy. The results also showed that if the percentage of available samples is below 30 (i.e., we are acquiring signals at rates that are 30% of the original Nyquist rate), the DPSS- and MDPSS-based schemes achieve similar accuracy.

Figure 4

Increasing the number of samples used in CS while altering the SNR values: (a) SNR = 5 dB; (b) SNR = 15 dB; (c) SNR = 25 dB; and (d) SNR = 35 dB. The dashed line denotes the MSE obtained with the DPSS; the solid line indicates the MSE obtained with a 15-band MDPSS-based dictionary; and the solid line with squares denotes a 7-band MDPSS-based dictionary.

The results of the third numerical experiment are summarized in Figure 5. They clearly depict the advantage of the CS approach based on the MDPSS over the approach based on DPSS even when non-uniform sampling is used. For all four considered cases, we achieved more accurate results with MDPSS than with DPSS. Additionally, more accurate results are achieved when we use a 15-band dictionary than a 7-band dictionary. This is in accordance with the previous results shown in Figure 3, which also showed that more comprehensive dictionaries can provide more accurate results because they can account for many different time-varying bandwidth scenarios.

Figure 5

The effects of random time positions of samples on the accuracy of the proposed scheme while altering the bandwidth of discrete prolate sequences: (a) W = 0.300; (b) W = 0.325; (c) W = 0.350; (d) W = 0.375. The dashed line denotes the MSE obtained with the DPSS; the solid line indicates the MSE obtained with a 15-band MDPSS-based dictionary; and the solid line with squares denotes a 7-band MDPSS-based dictionary.

5.2 CS of swallowing accelerometry signals

Tables 1, 2, 3 and 4 depict the results of the numerical analysis when the proposed scheme is applied to dual-axis swallowing accelerometry signals. Sample signals are shown in Figure 6.

Table 1 Performance of the proposed method for recovery of dual-axis swallowing accelerometry signals when considering 30% of samples and a uniform sampling scheme
Table 2 Performance of the proposed method for recovery of dual-axis swallowing accelerometry signals when considering 30% of samples and a non-uniform sampling scheme
Table 3 Performance of the proposed method for recovery of dual-axis swallowing accelerometry signals when considering 50% of samples and a uniform sampling scheme
Table 4 Performance of the proposed method for recovery of dual-axis swallowing accelerometry signals when considering 50% of samples and a non-uniform sampling scheme
Figure 6

Sample wet swallow from a healthy participant: (a) the original signal in the A-P direction; (b) the original signal in the S-I direction; (c) the recovered signal in the A-P direction (50% samples, CC = 99.7%); (d) the recovered signal in the S-I direction (50% samples, CC = 99.8%); (e) the error between the original and the recovered signal in the A-P direction; (f) the error between the original and the recovered signal in the S-I direction.

Several observations are in order. First, we achieved very high agreement between the reconstructed data and the original signals with uniformly spread out samples. Statistically higher CC values were achieved with 50% of samples than with 30% of samples (p << 0.01), and the three error metrics were correspondingly statistically lower with 50% of samples (p << 0.01).

Second, statistically worse results have been obtained when using non-uniform (random) sampling times (p << 0.01) in comparison to uniform sampling for both 30% of samples and 50% of samples. This result is expected, as it becomes more challenging to recover the signal accurately with non-uniform samples. Additionally, it is difficult to recover swallowing vibrations accurately, given that these vibrations are short-duration transients. Unless the non-uniform samples capture the behavior of these short-duration transients, a larger recovery error is achieved. However, with 50% of samples, we still obtain very high agreement between the recovered data and the original signals. As a matter of fact, the results obtained with 50% of samples with non-uniform sampling are comparable to the results obtained with 30% of samples when using uniform sampling.

Third, amongst the considered swallowing tasks, dry swallows tend to be recovered most accurately, followed by the wet swallows and lastly by the wet chin down swallows. From a physiological point of view, this is expected since during the dry swallowing manoeuver only small amounts of liquid (i.e., saliva) are swallowed. It is also expected that wet chin down swallows will be more difficult to recover due to the complex maneuvering required during these swallows, which may introduce signal components otherwise not present during the dry and/or wet swallowing tasks.

Therefore, based on the presented results, we can state with high confidence that CS based on the time-frequency dictionary containing MDPSS is a suitable scheme for dual-axis swallowing accelerometry signals. Particularly accurate results have been obtained when we use 50% of samples. We expect that further improvements can be achieved by optimizing the parameters of the recovery process with respect to the considered error metrics.

6 Conclusion

In this article, a CS algorithm for accurate reconstruction of dual-axis swallowing accelerometry signals from sparse samples was proposed. The proposed algorithm uses a time-frequency dictionary based on MDPSS. The modulation of DPSS was performed in order to account for the time-varying nature of the dual-axis swallowing accelerometry signals. The proposed CS algorithm was tested using both synthetic test signals and swallowing accelerometry signals. In both cases, we achieved very accurate representations with MDPSS, which makes these bases suitable for CS approaches of swallowing accelerometry signals. Specifically, we showed that even when the dual-axis swallowing accelerometry signals were subsampled at 50% below the Nyquist rate, we still achieved very accurate representations of these signals.