1 Introduction

A throat polyp is an abnormal growth of tissue in the throat. Such growths are often called vocal polyps because most of them are located on the vocal cords. Throat polyps are usually not life-threatening, but they can become cancerous, so a doctor should perform a biopsy. It is common to have throat polyps and be completely unaware of them, particularly if they are fairly small; such polyps may break off and disappear inside the body or clear up by themselves. However, throat polyps can also grow large enough to affect a person's ability to speak.

The position of these polyps will determine the effect on the voice. They can change the pitch of the voice so that it becomes lower than normal or cause the voice to sound hoarse and croaky.

The traditional methods of throat polyp diagnosis are indirect laryngoscopy, video-laryngoscopy, and stroboscopic light [1]. These methods require special instruments and depend on the experience of the pathologist, and patients usually find them uncomfortable or painful. Because throat polyps alter the voice, they can in principle be detected from changes in a patient's voice. In [2], two statistical characteristics, the root-mean-square delay spread and the standard deviation, were employed to describe the frequency-domain characteristics of speech and used as the two antecedents of a fuzzy logic system that diagnosed polyp patients. The results demonstrated that the method could detect throat polyps with a low probability of missed detection and a 0% false alarm rate. In [1], two fuzzy classifiers and a Bayesian classifier were designed for throat polyp detection based on the patients' vowel voices /a:/ and /i:/, and the experimental results showed that the interval type-2 fuzzy classifier performed best. Meanwhile, the short-time Fourier transform and singular value decomposition were applied to the vowel voice samples, and the observed power decay rate could be used as an identifier in throat polyp detection. In [3], several speech analysis methods for diagnosing laryngeal function were discussed, and a new parametric scheme for voice diagnosis was presented because the existing methods could not be applied in clinical cases. In [4], a novel algorithm based on Daubechies' discrete wavelet transform, linear prediction coefficients, and least squares support vector machines was described for identifying laryngeal pathologies by digital analysis of the voice. The experimental results showed that the approach yielded an adequate larynx pathology classifier for identifying nodules in the vocal folds, with over 90% classification accuracy.
However, more voice data must be sampled to reach a better diagnosis result, which becomes increasingly difficult because of sampling bandwidth and storage space constraints. In this paper, we use separable compressive sensing and singular value decomposition (SVD) together with a support vector machine (SVM) to detect throat polyps while reducing the burden of voice data collection and storage.

Compressive sensing (CS), proposed by Donoho and Candès in 2006 [5, 6], has attracted the interest of theoreticians and practitioners alike. In contrast to the common framework of first collecting as much data as possible and then discarding the redundant data with digital compression techniques, CS minimizes the collection of redundant data in the acquisition step [7]. The original signal can then be reconstructed from the low-dimensional measurements by solving an optimization problem, which means that the compressed samples contain the main features of the original signal. However, the bottleneck of CS for big data collection is the size of the random measurement matrix, which requires a huge amount of storage and creates a tremendous computational burden. To overcome this difficulty, a separable compressive sensing operator was proposed; at the cost of a small increase in the number of compressive samples, it reduces the storage of the sensing matrix and the computational complexity [7, 8].

Building on the merits of separable compressive sensing for big data collection, this paper proposes an intelligent throat polyp detection algorithm that analyzes the patients' vowel voices with separable compressive sensing. The remainder of the paper is organized as follows. Section 2 derives the throat polyp detection method with SVD and SVM. Section 3 introduces the theory of separable compressive sensing. Section 4 presents and analyzes the experimental results of throat polyp detection with the proposed procedure. Section 5 gives the conclusions and discussion.

2 Intelligent throat polyp detection algorithm

Speech usually consists of vowels and voiced consonants. The voiced consonants cannot produce vocal cord vibration because they come from the vibration of lips and teeth. Meanwhile, multiple vowels in one speech sample can cause aliasing in the spectrum of the sample. Consequently, in this paper we use only vowels, specifically /a:/ and /i:/, to investigate throat polyp detection based on the patients' voices.

2.1 Singular value decomposition of vowel voices

A sustained utterance of the vowel /a:/ or /i:/ is approximately periodic. Thus, we can construct a P × Q matrix D to represent the whole utterance, where the utterance is divided into P segments, each of length Q.

SVD is an effective numerical tool for analyzing matrices. The SVD of the matrix D is a factorization of the form

$$D = U S V^{T}$$
(1)

where U is a P × P orthogonal matrix, V is a Q × Q orthogonal matrix, and S is a P × Q diagonal matrix with non-negative diagonal elements $\lambda_i$. The quantities $\lambda_i$ are called the singular values of the matrix D. The singular values contain very valuable information about the properties of the matrix [9-11].

To eliminate the effect of small diagonal elements on throat polyp detection, we keep only the K largest elements and normalize them with the following formula.

$$s_i = \frac{\lambda_i^{2}}{\sum_{j=1}^{K} \lambda_j^{2}}, \quad i = 1, 2, \ldots, K$$
(2)

The K largest singular values vary when a person suffers from throat polyps, so they can be analyzed with intelligent algorithms for throat polyp detection. The parameter K can be chosen such that the sum of the K largest singular values is nearly the same as the sum of all singular values.
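
The feature extraction of this subsection can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `svd_features` and `choose_K`, the 0.99 threshold, and the synthetic sine signal are all hypothetical.

```python
import numpy as np

def svd_features(signal, P, Q, K):
    """Return the K normalized squared singular values s_i of Eq. (2)."""
    signal = np.asarray(signal, dtype=float)
    if signal.size < P * Q:
        raise ValueError("signal must contain at least P * Q samples")
    if K > min(P, Q):
        raise ValueError("K cannot exceed min(P, Q)")
    # Row p of D holds the p-th length-Q segment of the utterance.
    D = signal[:P * Q].reshape(P, Q)
    # NumPy returns the singular values sorted in descending order.
    lam = np.linalg.svd(D, compute_uv=False)
    lam_k = lam[:K]
    return lam_k**2 / np.sum(lam_k**2)

def choose_K(lam, energy=0.99):
    """Smallest K whose singular values sum to at least `energy` of the total.

    ASSUMPTION: the 0.99 threshold is illustrative; the text only asks for
    the partial sum to be "nearly the same" as the full sum.
    """
    ratio = np.cumsum(lam) / np.sum(lam)
    return min(int(np.searchsorted(ratio, energy)) + 1, len(lam))

# Synthetic stand-in for a sustained vowel: a sine whose period equals Q,
# so every row of D is identical and D has rank one.
P, Q, K = 8, 50, 3
t = np.arange(P * Q)
signal = np.sin(2.0 * np.pi * t / Q)
features = svd_features(signal, P, Q, K)
lam_all = np.linalg.svd(signal.reshape(P, Q), compute_uv=False)
K_auto = choose_K(lam_all)
```

For this perfectly periodic toy signal almost all of the energy falls on the first singular value; a real recording would spread energy over more of them.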

2.2 Intelligent throat polyp detection based on SVM

SVM is a relatively new computational learning method rooted in statistical learning theory, which in turn builds on Vapnik-Chervonenkis theory. It follows the structural risk minimization principle and is a popular machine learning method for classification that can handle very large feature spaces.

Consider a training set of L data points $(f(x_i), x_i)$, $i = 1, 2, \ldots, L$, where $x_i \in \mathbb{R}^{K}$ holds the K singular values of a vowel voice. For throat polyp detection, the class labels are $f(x_i) = 1$ for the class with throat polyps and $f(x_i) = -1$ for the class without throat polyps. The classifier is constructed as follows. One assumes that [12, 13]

$$\begin{cases} \omega^{T}\phi(x_i) + b \ge 1, & \text{if } f(x_i) = 1, \\ \omega^{T}\phi(x_i) + b \le -1, & \text{if } f(x_i) = -1, \end{cases}$$
(3)

which is equivalent to

$$f(x_i)\left[\omega^{T}\phi(x_i) + b\right] \ge 1, \quad i = 1, 2, \ldots, L,$$
(4)

where φ is a nonlinear function that maps the input space into a higher dimensional space, and the vector ω and the scalar b define the position of the separating hyperplane. However, the function φ is not explicitly constructed. To obtain the separating hyperplane in the higher dimensional space, slack variables $\xi_i$ are introduced and the following primal optimization problem is solved

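$$\min_{\omega, b, \xi} \; \frac{1}{2}\omega^{T}\omega + C\sum_{i=1}^{L}\xi_i$$
$$\text{subject to} \quad f(x_i)\left[\omega^{T}\phi(x_i) + b\right] \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, 2, \ldots, L$$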
(5)

By training on the data, we obtain the support vectors and kernel parameters of the model used for prediction.
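
For reference, a trained kernel SVM is normally evaluated through the standard dual-form decision function, which shows where the support vectors and the kernel enter the prediction. The source does not write this out, so the symbols are assumed here: $\alpha_i$ are the dual coefficients, $\mathcal{S}$ is the index set of the support vectors, and the kernel is written $\kappa$ to avoid a clash with the number of singular values K.

```latex
\hat{f}(x) = \operatorname{sign}\!\left( \sum_{i \in \mathcal{S}} \alpha_i \, f(x_i) \, \kappa(x_i, x) + b \right),
\qquad
\kappa(x_i, x) = \phi(x_i)^{T} \phi(x)
```

Only kernel evaluations are needed, which is why the mapping φ never has to be constructed explicitly.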

In this paper, we extract K successive singular values of the vowel voices /a:/ and /i:/ from patients with and without throat polyps. Part of the data is used as training data to establish the SVM classification model for throat polyp detection.
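
To make the primal problem (5) concrete, the sketch below trains its linear special case, with φ taken as the identity map, by plain full-batch subgradient descent on the equivalent hinge-loss objective. This is a swapped-in solver chosen for brevity, not the C-SVM program used in the experiments; the function names, the step size, the epoch count, and the synthetic three-dimensional features are all assumptions.

```python
import numpy as np

def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=200):
    """Minimize 0.5 * w.w + C * sum_i max(0, 1 - y_i (w.x_i + b)).

    This is problem (5) with phi(x) = x and the slack variables eliminated.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1.0                      # points with nonzero slack
        grad_w = w - C * (y[viol][:, None] * X[viol]).sum(axis=0)
        grad_b = -C * y[viol].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(X, w, b):
    """Label +1 (polyp) or -1 (no polyp) from the sign of w.x + b."""
    return np.where(X @ w + b >= 0.0, 1, -1)

# ASSUMPTION: synthetic, well-separated stand-ins for the singular value
# features; real features would come from patient recordings.
rng = np.random.default_rng(0)
X_pos = rng.normal(loc=2.0, scale=0.5, size=(20, 3))
X_neg = rng.normal(loc=-2.0, scale=0.5, size=(20, 3))
X_train = np.vstack([X_pos, X_neg])
y_train = np.concatenate([np.ones(20), -np.ones(20)])

w, b = train_linear_svm(X_train, y_train)
accuracy = np.mean(predict(X_train, w, b) == y_train)
```

A kernel SVM would replace the inner product with a kernel evaluation and is usually trained in the dual; the linear case is shown only because it keeps the objective of (5) visible in the code.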

3 Background on separable compressive sensing

Compressive sensing is an emerging theory that is based on the fact that a relatively small number of random projections of a signal can contain most of its salient information.

Given an observation of a vowel voice $x \in \mathbb{R}^{N}$, it can be expressed as:

x = Φθ ,
(6)

where $\theta \in \mathbb{R}^{N}$ is the vector of expansion coefficients under the orthonormal basis Φ. If θ has only U ≤ N nonzero coefficients, we say that the signal x is U-sparse.

According to CS theory, a length-N signal that is U-sparse in some basis can be recovered exactly or approximately from a non-adaptive linear projection of the signal onto a random projection basis. In matrix notation, this can be described as follows [14-25]:

y = Ψx ,
(7)

where $y \in \mathbb{R}^{M}$ and Ψ is an M × N random matrix. The appeal of CS is that only M = O(U log(N/U)) random measurements are needed to recover the signal x by solving the $\ell_0$-norm constrained optimization problem.

For large N, storing the random measurement matrix $\Psi \in \mathbb{R}^{M \times N}$ and solving the recovery problem associated with (7) are hardly feasible. In this paper, we show that these difficulties can be overcome by using a random measurement matrix that is separable in two dimensions.

If the random projection matrix Ψ is separable in two dimensions, Equation (7) can be rewritten as a two-dimensional (2-D) separable transform [7, 8]:

$$Y = \Psi_x X \Psi_y^{T},$$
(8)

where $X \in \mathbb{R}^{P \times Q}$ is the 2-D representation of the voice signal x with N = P × Q. The random matrix Ψ is given by the Kronecker product of the matrices $\Psi_x \in \mathbb{R}^{m \times P}$ and $\Psi_y \in \mathbb{R}^{n \times Q}$, that is, $\Psi = \Psi_x \otimes \Psi_y$ with M = m × n. Note that Ψ is determined by as few as (m × P + n × Q) entries, compared with (M × N) entries for a non-separable random matrix, which markedly reduces the storage and computation burden of the random matrix.
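
The equivalence between the separable form (8) and the vectorized form (7), and the resulting storage saving, can be checked numerically. The sketch below is illustrative only: the small dimensions, the Gaussian entries, and the helper `storage_entries` are assumptions, and the vectorization is taken row by row, which is the convention under which the Kronecker factors appear in the order $\Psi_x \otimes \Psi_y$.

```python
import numpy as np

rng = np.random.default_rng(0)
P, Q = 8, 6          # size of the 2-D signal X (N = P * Q)
m, n = 4, 3          # size of the compressed matrix Y (M = m * n)

X = rng.standard_normal((P, Q))
Psi_x = rng.standard_normal((m, P))   # ASSUMPTION: Gaussian random entries
Psi_y = rng.standard_normal((n, Q))

# Separable measurement, Eq. (8): only Psi_x and Psi_y are ever stored.
Y = Psi_x @ X @ Psi_y.T

# Equivalent non-separable measurement, Eq. (7), on the row-major
# vectorization of X with the full M x N Kronecker matrix.
Psi = np.kron(Psi_x, Psi_y)
y = Psi @ X.reshape(-1)
same = np.allclose(Y.reshape(-1), y)

def storage_entries(m, n, P, Q):
    """Entries stored for the separable pair versus the full matrix."""
    separable = m * P + n * Q
    full = (m * n) * (P * Q)
    return separable, full

# Sizes used in Section 4: a 1,024 x 1,024 matrix compressed to 512 x 512.
sep_count, full_count = storage_entries(512, 512, 1024, 1024)
```

For the 1,024 × 1,024 to 512 × 512 case, the separable pair needs 1,048,576 entries, whereas the full matrix would need 274,877,906,944.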

Furthermore, the orthonormal basis Φ can also be given by a Kronecker product. Equation (8) can then be rewritten as:

$$Y = \Psi_x \Phi_x \Theta \Phi_y^{T} \Psi_y^{T},$$
(9)

where $\Phi = \Phi_x \otimes \Phi_y$ and Θ is the 2-D representation of the expansion coefficient vector θ. Based on the recovery theory of compressive sensing, the reconstruction problem can be written as

$$\min \|\Theta\|_{1} \quad \text{subject to} \quad Y = \Psi_x \Phi_x \Theta \Phi_y^{T} \Psi_y^{T}.$$
(10)

The provable success of separable CS in signal reconstruction shows that the collected low-dimensional measurements Y contain the main features of the original signal. In other words, throat polyps may be detected by applying the proposed intelligent algorithm to compressed samples collected with separable compressive sensing.

4 Experimental results and analysis

In the experiments, /a:/ and /i:/ vowel voice signals were collected from 26 patients, 13 with throat polyps and 13 without. The 20 largest singular values were chosen as the features. The C-SVM program proposed by Dr. Lin was used for classification and throat polyp prediction.

First, we tested the classification performance for different P and Q with the proposed detection algorithm based on SVD and SVM. The features of eight patients without throat polyps and eight patients with throat polyps were used as training data to build the prediction model. The features of the remaining ten patients were used to test the performance of the detection method. The experimental results are shown in Figures 1 and 2.

Figure 1

Correct rate of throat polyp prediction under different P with Q = 500.

Figure 2

Correct rate of throat polyp prediction under different Q with P = 2,000.

Figures 1 and 2 show the prediction results under different P with fixed Q and under different Q with fixed P. It can be seen that the size of the matrix D influences the correct rate of prediction. The correct rate is higher when the rows of D are longer (larger Q), because a longer row captures a more complete period of the vowel voice. On the other hand, the correct rate of prediction fluctuates as the column length of D (P) increases, because the column length affects the singular values. Therefore, vowel voice signals that contain complete periods of the voice should be collected to improve the correct rate of prediction.

Second, we used separable compressive sensing to obtain compressed vowel voice signals and predicted the classification results with the proposed intelligent algorithm. We set the size of the matrix D to 1,024 × 1,024 and tested the correct rate of prediction with the compressed data matrix under different compression ratios. The results are shown in Figures 3 and 4.
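
The compressed-domain feature extraction used in this experiment can be sketched as follows. It is a hedged illustration rather than the authors' code: the function name `compressed_features`, the Gaussian measurement matrices, the reduced 64 × 64 size, and the random stand-in for D are all assumptions made so that the example runs quickly.

```python
import numpy as np

def compressed_features(D, m, n, K, rng):
    """Compress D with Eq. (8), then take the K features of Eq. (2)."""
    P, Q = D.shape
    # ASSUMPTION: i.i.d. Gaussian entries; the text only says "random".
    Psi_x = rng.standard_normal((m, P))
    Psi_y = rng.standard_normal((n, Q))
    Y = Psi_x @ D @ Psi_y.T                       # m x n compressed data
    lam = np.linalg.svd(Y, compute_uv=False)[:K]  # K largest singular values
    return lam**2 / np.sum(lam**2)

rng = np.random.default_rng(0)
D = rng.standard_normal((64, 64))   # stand-in for a 1,024 x 1,024 voice matrix

# Repeat with ten independent measurement matrices, as in the Figure 3 test,
# and measure how much each feature moves from draw to draw.
features = [compressed_features(D, 32, 32, 20, rng) for _ in range(10)]
spread = np.std(features, axis=0)
```

The per-feature `spread` gives a rough view of how sensitive the SVM inputs are to the particular random matrices that were drawn.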

Figure 3

Correct rate of throat polyp prediction under different random measurement matrices, with a compressed data matrix of size 512 × 512.

Figure 4

Correct rate of throat polyp detection under different compression ratios.

In separable compressive sensing, the measurement matrix is random, so the correct rate of prediction for compressed data may differ from one random measurement matrix to another. We predicted the detection results under ten different random measurement matrices with a compressed data size of 512 × 512. Figure 3 shows the prediction results. The mean correct rate of prediction was about 50%, with small fluctuations. This indicates that the singular values used for SVM training and testing were similar even though they were obtained with different random measurement matrices. Moreover, the small number of training samples also contributed to the fluctuation and the low correct rate of prediction.

Figure 4 shows the prediction results for different compression ratios; each correct rate is the mean of five detection results obtained with different random measurement matrices. The correct rate of prediction was again about 50% for the different compression ratios, and we believe that the fluctuations were caused by the small amount of training data. This demonstrates that the intelligent throat polyp detection method remains effective on compressed data even with very few samples, and it relieves the storage and computation burden of throat polyp detection in the big data domain.

5 Conclusions

Big data is the term for a collection of large, complex, longitudinal, and distributed data sets that is difficult to process using on-hand database management tools or traditional data processing applications. Compressive sensing can sample and compress data simultaneously, which provides a new approach to big data classification.

In this paper, we proposed an intelligent method to detect throat polyps with SVD and SVM based on the vowel voices of the patients. Because voice signals impose a heavy storage and computation burden, we used separable compressive sensing for data compression and sampling, and throat polyp detection was then carried out in the compressed domain. The experimental results demonstrated that the performance of the prediction was stable.