Digital Auscultation Analysis for Heart Murmur Detection
Abstract
This work presents a comparison of different approaches for the detection of murmurs from phonocardiographic signals. Taking into account the variability of the phonocardiographic signals induced by valve disorders, three families of features were analyzed: (a) time-varying & time–frequency features; (b) perceptual features; and (c) fractal features. With the aim of improving the performance of the system, its accuracy was first tested using several combinations of the aforementioned families of parameters. In the second stage, the main components extracted from each family were combined with the goal of improving the accuracy of the system. The contribution of each family of features was evaluated by means of a simple k-nearest neighbors classifier, showing that fractal features provide the best accuracy (97.17%), followed by time-varying & time–frequency (95.28%) and perceptual features (88.7%). However, an accuracy of around 94% can be reached using just the two main features of the fractal family; therefore, considering the difficulties related to the automatic intrabeat segmentation needed for spectral and perceptual features, this scheme becomes an interesting alternative. The conclusion is that fractal-type features were the most robust family of parameters (in the sense of accuracy vs. computational load) for the automatic detection of murmurs. This work was carried out using a database that contains 164 phonocardiographic recordings (81 normal and 83 records with murmurs). The database was segmented to extract 360 representative individual beats (180 per class).
Keywords
Digital auscultation, Heart sounds, Phonocardiography, Murmur detection, Feature extraction, Spectrograms, Complexity analysis
Abbreviations and Terminology
 PCG
Phonocardiographic signal
 MFCC
Mel-Frequency Cepstral Coefficients
 STFT
Short-Time Fourier Transform
 SFFS
Stepwise Floating Forward Selection
 FT
Fourier Transform
 WVD
Wigner–Ville Distribution
 GT
Gabor Transform
 WT
Wavelet Transform
 D_{2}
Correlation dimension
 E_{S}
Systole
 E_{D}
Diastole
 ROC
Receiver Operating Characteristic
 AUC
Area Under the ROC curve
 SE
Standard Error
 knn
k-Nearest Neighbors
Introduction
Cardiac mechanical activity is appraised by auscultation and by processing heart sound recordings (known as phonocardiographic signals, PCG), an inexpensive and noninvasive procedure. The importance of classic auscultation has decreased due to its inherent restrictions: the physical limitations of the human ear, the subjectivity of the examiner, skills that take years to acquire and refine, etc. Nevertheless, the PCG has preserved its importance in pediatric cardiology, cardiology, and internal diseases,34 in evaluating congenital cardiac defects,33 and in primary home health care, where an intelligent stethoscope with decision support abilities would be valuable.2,33
The automatic detection of murmurs strongly depends on the appropriate features (data representation), which are mostly related to timing, morphology, and spectral properties of heart sounds.35 Although cardiac murmurs are nonstationary signals and exhibit sudden frequency changes and transients,10 it is common to assume linearity of the feature sets extracted from heart sounds (time and spectral features, frequency representation with time resolution, and parametric modeling25,36). To capture nonstationary transients and fast changes of the PCG, time–frequency features are widely used in heart sound analysis,31 and different approaches have been proposed to deal with that nonstationary nature. In Jeharon et al.,16 an expert system was trained with spectrograms and energy features, using clinical knowledge for an effective rule codification. However, the main restriction is how to choose the relevant features that adequately represent cardiac dynamics. In Javed et al.,15 the features extracted from individual systolic and diastolic intervals using the spectrogram were classified depending on their position within the cardiac cycle using the Wigner–Ville distribution (WVD), obtaining an accuracy of around 86.4%. In Debbal and Bereksi-Reguig,8 several spectral techniques were used to process the heart sounds, among them the Fourier transform, short-time Fourier transform, WVD, and continuous wavelet transform, the last one being the most successful, although the wavelet decomposition of the PCG tends to produce a wrong time location of the spectral components.11 In Leung et al.,22 using time–frequency features, a sensitivity of 97.3% and a specificity of 94.4% were reported for systolic murmur detection using a neural-network-based classifier.
On the other hand, the human heart produces low-frequency sounds (20–1000 Hz), and therefore in former studies2,17 the PCG has been characterized by means of perceptual analysis, using well-known techniques from speech processing such as Mel-Frequency Cepstral Coefficients (MFCC).37
To provide a robust representation of sounds in an automatic heart disease diagnosis system, a Mel-scaled wavelet transform was developed in Wang et al.37 It combines the advantages of a linear perceptual scale obtained by Mel-mapping with the suitability of the wavelet transform for analyzing nonstationary signals. In Johnson et al.,17 the MFCC representation is proposed in combination with Principal Component Analysis (PCA) to interpret the acoustic information, showing a fair performance (specificity 72.4% and sensitivity 63.4%). In Telatar and Erogul,34 Waveform Similarity Overlap-Add and Multiresolution Time Scale Modification algorithms were used for the diagnosis of cardiac disorders, yet the results were not satisfactory because of the highly inconsistent nature of murmurs.
The aforementioned features take into account the variability of the PCG induced by the murmurs only indirectly, rather than characterizing the dynamic behavior of the acoustic recordings. Since heart sounds contain nonlinear and non-Gaussian information, the dynamic behavior is not revealed directly in the spectral components.9 In this sense, features inspired by higher-order statistics, chaos theory, and fractal complexity have been proposed to describe such behavior of the PCG signal,2 taking into account that many diseases are described by less complex dynamics than those observed under healthy conditions (e.g., in ECG signals), although in PCG signals the opposite takes place, because a more intense murmur originates from a more complex flow.14 Complexity here refers specifically to a multiscale, fractal type of variability in structure or function. For complex processes, fractal long-range correlations produce a kind of memory effect, so the value of some variables (e.g., the heart beat at a particular instant) is related not just to the immediately preceding values, but also to fluctuations in the remote past. In this sense, certain pathologies are marked by a breakdown of this long-range organization property, producing an uncorrelated randomness similar to white noise. Cardiac interbeat intervals in PCG signals normally fluctuate in a complex, apparently erratic manner, even in individuals at rest.12 This highly irregular behavior conflicts with conventional analysis, which requires stationary datasets, and fractal analysis is a good candidate for studying this type of time series, where fluctuations on multiple time scales take place.
In Wang et al.,38 a detailed analysis was carried out using spectral and energy features together (MFCC, Short-Time Fourier Transform (STFT), and instantaneous energy) for murmur detection, using a Hidden Markov Models-based classifier, obtaining a sensitivity of 95.2% and a specificity of 95.3%.
To find the feature subset that minimizes the classification error, a feature extraction stage should be used. In Ahlstrom et al.,2 a feature extraction procedure was proposed for systolic heart murmur classification using 207 features, such as Shannon energy, wavelet transform, fractal dimensions, and recurrence quantification analysis. After dimensionality reduction using the Stepwise Floating Forward Selection (SFFS) method, a multidomain subset consisting of 14 features was obtained. Using a neural-network-based classifier, the selected multidomain subset gave a detection accuracy of 86% for mitral insufficiency, aortic stenosis, and physiological murmurs.
The main goals of the present paper are: (a) the extraction of time-varying, time–frequency, perceptual, and fractal-type features that directly take into account the variability of the PCG induced by murmurs originating from valve pathologies; and (b) the comparison and evaluation of the best feature set for the classification of heart murmurs. To achieve these goals, this paper proposes to characterize directly the dynamical behavior of the cardiac sound signal, specifically by means of fractal-type features for heart murmur detection. The main purpose is to generate a feature set that correctly represents the dynamics of the PCG signal in order to detect pathologies, and thereby to increase the accuracy of the algorithms. Consequently, we used a simple knn classifier, since the aim is to emphasize the characterization and representation rather than the classification stage.
The paper is organized as follows: the section “Background” presents the background on time-varying and time–frequency analysis, the perceptual study, and the complexity analysis based on fractal features. Section “Experimental Outline” refers to the experimental framework, describing the database and methodology used. In section “Experimental Results”, the results obtained with the different feature sets are presented and compared. Finally, the conclusions are presented in the last section.
Background
Throughout this work, three families of parameters have been considered to parameterize the PCG beats: (a) time-varying & time–frequency; (b) perceptual; and (c) fractal features. A brief review of these families is given next.
Time-Varying and Time–Frequency Analysis (TV&TF)
In the STFT the window length remains constant. Therefore, the extraction of information with fast changes in time (i.e., high-frequency values) must be accomplished with short, well-localized time intervals rather than over the whole interval of definition of the PCG signal. Conversely, low-frequency components require long time intervals of analysis.
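As a rough numerical illustration of this trade-off, the following Python sketch computes the time span and the frequency-bin spacing of a fixed STFT window at the 44.1 kHz sampling rate reported for the recordings. The window lengths are illustrative assumptions and are not values taken from this study.

```python
FS = 44_100  # sampling rate in Hz reported for the PCG recordings


def stft_resolution(window_samples, fs=FS):
    """Return (window duration in ms, frequency-bin spacing in Hz)."""
    duration_ms = 1000.0 * window_samples / fs
    bin_spacing_hz = fs / window_samples
    return duration_ms, bin_spacing_hz


# Hypothetical window lengths: a longer window sharpens frequency
# resolution but blurs the timing of fast transients, and vice versa.
for n in (256, 1024, 4096):
    dur, df = stft_resolution(n)
    print(f"N={n:5d}: window = {dur:6.2f} ms, bin spacing = {df:7.2f} Hz")
```

Whatever the window length, the product of the window duration (in seconds) and the bin spacing (in Hz) stays equal to one, which is the fixed-resolution constraint described above.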
An alternative to the WVD and STFT is the Gabor transform (GT^{1}), a signal decomposition method that uses frequency-modulated Gaussian functions. Because Gaussian functions are optimally concentrated in the joint time–frequency plane, the comparisons reflect a signal’s behavior in local time and frequency. The GT has been found to be a good alternative to the STFT-based spectrogram and the WVD. However, regardless of the windowing function, for a given window width the spatial resolution remains constant and is bounded by the time window aperture.
Perceptual Analysis
The human heart produces sounds with low-frequency components (20–1000 Hz), a large dynamic range, and changing content. On the other hand, psychophysical studies have shown that human perception of the frequency content of audio sounds does not follow a linear scale but rather a Mel-warped frequency scale, which is spaced linearly at low frequencies and logarithmically at high frequencies to capture important characteristics of audible sounds.24 This fact makes the conventional frequency-domain algorithms unable to reveal the spectral and temporal acoustic information of heart sounds. In this sense, the MFCC parameters, which are grounded on the perceptual analysis of sound, have been used before to extract features from PCG signals, providing a good performance on typical heart sounds.38 Thus, to simplify the PCG spectrum without any significant loss of data, a set of triangular bandpass filters was used, which are nonuniform in the original spectrum and uniformly distributed in the Mel-warped spectrum. Each filter is multiplied by the spectrum so that only a single magnitude value is returned per filter.
Fractal Analysis
In this work, the correlation dimension, the Largest Lyapunov Exponent, and the Hurst exponent have been considered, as in Ahlstrom et al.2 A brief description of these features is given next.
Correlation Dimension (D _{2})
Largest Lyapunov Exponent—LLE (λ _{1})
Hurst Exponent (H)
Experimental Outline
Database
The database used in this study is made up of 148 de-identified adult subjects, who gave their informed consent and underwent a medical examination with the approval of the ethical committee. An electronic stethoscope (Welch Allyn® Meditron model) was used to acquire the heart sounds simultaneously with a standard 3-lead ECG (the DII derivation was used as a time reference because the QRS complex is clearly defined). Both signals were digitized at 44.1 kHz with 16 bits per sample. Tailored software was developed for recording, monitoring, and editing the heart sounds and ECG signals. Eight recordings, corresponding to the four traditional focuses of auscultation (mitral, tricuspid, aortic, and pulmonary areas), were taken for each patient in the phases of postexpiratory and postinspiratory apnea. Each recording lasts approximately 8 s and was obtained with the patient lying in the dorsal decubitus position. The recording time could not be extended because patients suffering from cardiac problems are not capable of maintaining either postinspiratory or postexpiratory apnea for a longer period. A diagnosis was carried out for the eight recordings of each patient, and the severity of the valve lesion was evaluated by cardiologists according to clinical routine. A set of 50 patients were labeled as normal, while 98 were labeled as exhibiting cardiac murmurs caused by valve disorders (aortic stenosis, mitral regurgitation, etc.). Furthermore, for training and validation of the algorithms, PCG signals labeled as normal and those labeled as murmur were separated, keeping in mind that not all of the eight recordings of a patient with murmurs were necessarily labeled as murmur, because a murmur does not generally appear in all focuses at once. This is why it is necessary to perform the diagnosis on each beat rather than on the whole set of PCG signals acquired from each patient. Then, 360 individual beats were extracted, 180 for each class.
The individual beats were picked out as the best from each cardiac sound signal after a visual and audible inspection by a cardiologist; this was done to select beats without artifacts and other types of noise that can impair the performance of the algorithms. It is important to remark that all focuses and phases (postinspiratory and postexpiratory) were treated equally during the tests done in this study.
Preprocessing and Segmentation
Beat Segmentation
The automatic beat segmentation of PCG recordings demands a considerable effort due to the dependency between the beat structure and the recording focus. As noted above, a murmur can show up in any of the PCG signals taken from the auscultation areas. Usually, S1 is more intense than S2 in the mitral and tricuspid focuses, while S2 is more intense than S1 in the aortic and pulmonary areas. Nevertheless, this is not a rule, because the opposite can also occur. As a result, this morphological beat variability is a serious restriction for the automated division of the PCG signal into segments. Therefore, the segmentation algorithm developed is based on the DII lead of the ECG recording, which was used to locate the occurrence of the S1 sounds, since the beginning of the first cardiac sound coincides with the onset of the respective QRS complex. The detection of the R peak in the ECG signal is carried out according to the procedure presented in Sahambi et al.,30 which is based on the WT, since the modulus maxima and the zero crossings of this transform correspond to abrupt changes of the signal.
Intrabeat Segmentation
Feature Extraction
Table 1. Description of the feature set extracted for each beat segmented from the PCG signal

| Family of parameters | Description | Segment | # Features |
| --- | --- | --- | --- |
| TV&TF analysis | Maximum of area under the Teager energy between systole and diastole of the WT decomposition, db4 and db6 (2 features) | Systole, Diastole | 85 |
| | Maximum of volume under the surface between systole and diastole of STFT, GT, and WT (3 features) | Systole, Diastole | |
| | Eigenspace representation of STFT, GT, WT, and WVD, picking 20 principal components for each representation (80 features) | Beat | |
| Perceptual analysis | Mean of the 14 MFCC coefficients extracted for each intrabeat event (56 features) | S1, Systole, S2, Diastole | 61 |
| | Relative MFCC calculated as maximum between 4th MFCC in systole and diastole (1 feature) | Systole, Diastole | |
| | Jitter of each intrabeat event (4 features) | S1, Systole, S2, Diastole | |
| Fractal analysis | Largest Lyapunov Exponent | Beat | 3 |
| | Hurst Exponent | Beat | |
| | Correlation Dimension | Beat | |
| Total number of features per beat | | | 149 |
The details of the feature extraction are given next.
TV&TF Features
As a result, each spectrogram is a two-dimensional array that can be considered as an image matrix, A, and therefore PCA can be used to carry out a conventional eigenspace representation.2 This type of representation efficiently encodes the contiguous time–frequency relations that characterize the dynamics depicted by spectrograms. That is, PCA exploits the inherent correlations among the matrix entries that represent the variations in the event of interest at specific time–frequency locations. After carrying out the eigenspace representation analysis, a total of 20 main components of the matrix A were used to characterize the spectrograms.
On the other hand, the presence of a murmur can be clearly evidenced by estimating the instantaneous energy of heart sounds within the systole and diastole intervals, while constraining the spectral analysis to a bandwidth of up to 500 Hz. In this work, a method for fast and simple estimation of the instantaneous energy using the Teager quadratic energy operator (10) is proposed. The energy estimation using the Teager algorithm18 is accomplished on a multiresolution representation of the heart sound,21 obtained with the WT using a Daubechies mother function. This approach facilitates the discrimination between S1, S2, and more complex murmurs.

The overall area of the energy under each beat event is computed, as proposed in Sharif et al.32 Since the energy in systole and diastole is often far lower than the energy evaluated over a whole PCG beat, it is better to evaluate the energy separately for systole and diastole. In the present work, we suggest taking the maximum of the energy in the systole (\( \hat{E}_{\text{S}} \)) and diastole (\( \hat{E}_{\text{D}} \)) intervals, i.e., \( \max \{ \sum_n \hat{E}_{\text{S}}[n], \sum_n \hat{E}_{\text{D}}[n] \} \), where n is the time index.

Based on the STFT, WT, and GT,20 the overall spectral volume confined under the surface of each transform is computed in the systole and diastole intervals, and the maximum of these two values is taken: \( \max \{ \sum_{n_\omega} \sum_n TFD_{\text{S}}[n_\omega, n], \sum_{n_\omega} \sum_n TFD_{\text{D}}[n_\omega, n] \} \), where \( TFD_{\text{S}} \) and \( TFD_{\text{D}} \) are the time–frequency representations of systole and diastole, respectively, and \( n \) and \( n_\omega \) are the time and frequency indexes, respectively. Because the relevance of a feature can be related to the scattering of its values, the estimation of the GT is proposed with a time aperture adapted to the statistical variance of the input random signal; consequently, the interval of estimation becomes inversely proportional to the standard deviation of the signal.

The eigenspace representation analysis is achieved for STFT, GT, WVD, and WT. A total of 20 main components were used as features for each representation.2
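The two energy-type features in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the standard discrete Teager-Kaiser form \( \psi[n] = x[n]^2 - x[n-1]\,x[n+1] \) is assumed for the energy operator, and the operator is applied directly to the samples passed in, whereas in this work it is applied to the wavelet decomposition of the heart sound.

```python
def teager_energy(x):
    """Discrete Teager-Kaiser operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def max_energy_feature(systole, diastole):
    """max{ sum_n E_S[n], sum_n E_D[n] } over the two intervals."""
    e_s = sum(teager_energy(systole))
    e_d = sum(teager_energy(diastole))
    return max(e_s, e_d)


def max_volume_feature(tfd_s, tfd_d):
    """max of the volumes under two time-frequency surfaces.

    Each argument is a list of rows (frequency index) of non-negative
    magnitudes over time, e.g. from an STFT, GT, or WT (assumed layout).
    """
    vol_s = sum(sum(row) for row in tfd_s)
    vol_d = sum(sum(row) for row in tfd_d)
    return max(vol_s, vol_d)


# Toy segments (hypothetical values): an oscillating systole, a silent diastole.
print(max_energy_feature([0, 1, 0, -1, 0], [0, 0, 0, 0, 0]))
```

Taking the maximum over the two intervals makes the feature indifferent to whether the murmur is systolic or diastolic, which is the stated reason for comparing the two segments.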
Perceptual Features
The estimation of the perceptual features requires the intrabeat segments to be divided into frames of 5 ms, with an overlap of 30%. The window length was selected under the assumption that the lowest frequency component considered is 20 Hz. The upper frequency considered is 1500 Hz, since some of the murmurs are characterized by high-frequency components. For the estimation of the MFCC parameters, each PCG beat is filtered using 14 triangular filters (M = 14). In view of the fact that heart murmurs have spectral components highly concentrated in the band around 600 Hz (698 Mels), the information from the filter centered at this frequency is the most relevant in this study.37 The final subset of MFCC that characterizes each intrabeat segment is calculated by averaging in time the parameters extracted for each window.
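The Mel mapping used for the filter bank can be illustrated with a short Python sketch. It assumes the common conversion \( m = 2595 \log_{10}(1 + f/700) \), which reproduces the 600 Hz to about 698 Mel correspondence quoted above; the function names and the equal spacing of the 14 centers between the 20 Hz and 1500 Hz band edges are assumptions made for illustration, not a description of the exact filter bank used here.

```python
import math


def hz_to_mel(f_hz):
    """Common Mel conversion (assumed form): 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_centers(f_low=20.0, f_high=1500.0, n_filters=14):
    """Center frequencies (Hz) of filters equally spaced on the Mel axis."""
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (n_filters + 1)
    return [mel_to_hz(m_low + step * (i + 1)) for i in range(n_filters)]


print(round(hz_to_mel(600.0)))  # Mel value corresponding to 600 Hz
for i, fc in enumerate(mel_filter_centers(), start=1):
    print(f"filter {i:2d}: center = {fc:7.1f} Hz")
```

Because the centers are uniform in Mel, their spacing in Hz grows with frequency, which is the nonuniform layout in the original spectrum described in the section “Perceptual Analysis”.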
The averaged MFCC coefficients were complemented with the jitter estimated for each intrabeat segment. The computation of the jitter is based on a prior estimation of the fundamental frequency, which is calculated for each intrabeat segment (S1, systole, S2, and diastole). As a result, a sequence of fundamental frequency values is obtained for each segment, and the jitter is calculated from this sequence.
Besides, to complement the discriminative capability of the MFCC representation for the systole and diastole segments (where the presence of a murmur is supposed to show the strongest evidence), it is suggested to evaluate the absolute difference of their respective estimated coefficients.

14 MFCC mean values for each intrabeat segment.

Relative 4th MFCC: This feature is determined as the maximum of the 4th MFCC coefficient in the systolic and diastolic segments. The reason for using the 4th coefficient lies in the fact that, according to the Sequential Forward Floating Selection (SFFS) algorithm with a cost function based on k-nearest neighbors (knn),3 the most relevant information of the MFCC is carried by the 4th coefficient. Furthermore, the motivation for computing the relative MFCC is that murmurs can appear in systole and/or diastole, so this feature identifies a murmur independently of its location in the PCG signal.

Jitter: Fundamental frequency variations across windows containing at least one period of the minor spectral component with significant energy from PCG beat.
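The last two perceptual features in the list above can be sketched as follows. The jitter formula used here (mean absolute difference between consecutive periods divided by the mean period) is a common definition adopted as an assumption, since the exact expression is not given in the text; the function names and the toy values are hypothetical.

```python
def relative_mfcc(mfcc_systole, mfcc_diastole, index=3):
    """Maximum of the 4th mean MFCC (zero-based index 3) over systole and diastole."""
    return max(mfcc_systole[index], mfcc_diastole[index])


def jitter(f0_values):
    """Relative period-to-period variation of a fundamental-frequency track.

    Assumed definition: mean |T[i+1] - T[i]| divided by mean T, with T = 1 / f0.
    """
    periods = [1.0 / f for f in f0_values]
    if len(periods) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))


# Hypothetical fundamental-frequency tracks (Hz), one value per analysis window.
print(jitter([100.0, 100.0, 100.0]))  # perfectly regular track
print(jitter([100.0, 104.0, 97.0, 101.0]))
```

A perfectly regular track gives zero jitter, and the value grows as the window-to-window fundamental frequency becomes more irregular.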
Fractal Features
The fractal features are based on nonlinear dynamics and have the ability to quantify the nonlinear behavior of the PCG signal. The use of fractal features is motivated since the dynamics of the system (including the nonstationary behavior) are embedded intrinsically into the attractor, and the measure of complexity in the reconstructed trajectory is able to characterize the dynamics.
Depending on the range of values of D _{2}, distinct heart diseases can be identified.7 The calculation of D _{2} follows the method proposed in Rosenstein et al.,28 which requires a previous estimation of the correlation sum, C(r) (16). The function ln{C(r)} vs. ln(r) is evaluated for every PCG beat, estimating its scaling region from the derivative, d{ln[C(r)]}/d{ln(r)}, as well as the respective evolution of D _{2} vs. d, d being the embedding dimension. The scaling region is determined taking into account that in any linear region of the function ln{C(r)}, the slopes (derivatives) tend to be similar. In this case, the derivative is estimated by calculating the slopes between neighboring points, with τ = 1 found to be a proper delay value.
The input arguments of the algorithm used to calculate the scaling region are the values of ln{C(r)} and ln(r), and the output arguments are two values corresponding to the maximum and minimum indexes of the scaling region contained in the vector of values of the axis ln(r). The first step is the estimation of the function d{ln[C(r)]}/d{ln(r)}, through computing the slopes in neighbor points. Next, the similarity of the magnitudes obtained after the differentiation of ln{C(r)} is analyzed, using the standard deviation of each analyzed segment, because the segment with the least deviation is the most similar in magnitude. Finally, the indexes that produce the least dispersion segment in the vectors ln{C(r)} and ln(r) are selected.
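The scaling-region search described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names, the maximum-norm distance in the correlation sum, and the fixed number of consecutive slopes (`width`) compared at a time are all choices made for the example.

```python
import math


def embed(x, dim, tau=1):
    """Delay embedding of a scalar series into dim-dimensional points."""
    n = len(x) - (dim - 1) * tau
    return [[x[i + j * tau] for j in range(dim)] for i in range(n)]


def correlation_sum(points, r):
    """Fraction of distinct point pairs closer than r (maximum norm, assumed)."""
    n = len(points)
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            if max(abs(a - b) for a, b in zip(points[i], points[j])) < r:
                count += 1
    return 2.0 * count / (n * (n - 1))


def scaling_region(log_r, log_c, width=4):
    """Indexes (min, max) in log_r of the segment whose slopes vary least."""
    slopes = [(log_c[i + 1] - log_c[i]) / (log_r[i + 1] - log_r[i])
              for i in range(len(log_r) - 1)]
    best_start, best_std = 0, float("inf")
    for s in range(len(slopes) - width + 1):
        seg = slopes[s:s + width]
        mean = sum(seg) / width
        std = math.sqrt(sum((v - mean) ** 2 for v in seg) / width)
        if std < best_std:
            best_start, best_std = s, std
    return best_start, best_start + width


def correlation_dimension(log_r, log_c, width=4):
    """Slope of ln C(r) vs. ln r across the detected scaling region."""
    lo, hi = scaling_region(log_r, log_c, width)
    return (log_c[hi] - log_c[lo]) / (log_r[hi] - log_r[lo])
```

In use, `correlation_sum` would be evaluated on `embed(beat, d, tau=1)` for a range of radii, and the logarithms of the radii and sums passed to `correlation_dimension`; repeating this for increasing d gives the evolution of D _{2} vs. d mentioned above.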
The value of H is defined as the slope that is obtained when calculating the average of the R(N)/σ(N) relationship given by (20), for different scales in the data length of the analyzed time series. The algorithm for estimating the Hurst exponent designed in this work does not overlap data regions, basically because overlapping regions do not give exact results.26 The sizes chosen for the analysis regions were powers of two, beginning with 2^{8} and ending at a size smaller than or equal to the total size of the series being analyzed, that is to say: 2^{8}, 2^{16}, 2^{32}, 2^{64}, etc. It is important to highlight that all these settings were chosen with the purpose of calibrating the algorithm, to improve the precision of the results.
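A rescaled-range estimate with non-overlapping windows can be sketched as follows. The sketch is illustrative only: the function names are hypothetical, the window sizes simply double from an assumed `min_size` (it does not attempt to reproduce the exact size sequence used in this work), and the synthetic test series stand in for PCG beats.

```python
import math
import random


def rescaled_range(segment):
    """R/S of one window; returns None if the window is constant."""
    n = len(segment)
    mean = sum(segment) / n
    cum = lo = hi = 0.0
    for v in segment:
        cum += v - mean  # cumulative deviation from the window mean
        lo, hi = min(lo, cum), max(hi, cum)
    std = math.sqrt(sum((v - mean) ** 2 for v in segment) / n)
    return (hi - lo) / std if std > 0 else None


def hurst_exponent(x, min_size=256):
    """Slope of log(mean R/S) vs. log(window size), non-overlapping windows."""
    log_n, log_rs = [], []
    size = min_size
    while size <= len(x):
        values = []
        for start in range(0, len(x) - size + 1, size):
            rs = rescaled_range(x[start:start + size])
            if rs is not None:
                values.append(rs)
        if values:
            log_n.append(math.log(size))
            log_rs.append(math.log(sum(values) / len(values)))
        size *= 2
    if len(log_n) < 2:
        raise ValueError("series too short for the chosen min_size")
    mx, my = sum(log_n) / len(log_n), sum(log_rs) / len(log_rs)
    num = sum((a - mx) * (b - my) for a, b in zip(log_n, log_rs))
    den = sum((a - mx) ** 2 for a in log_n)
    return num / den  # least-squares slope


# Synthetic check: uncorrelated noise vs. its running sum (a random walk).
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(4096)]
walk, total = [], 0.0
for v in noise:
    total += v
    walk.append(total)
h_noise = hurst_exponent(noise, min_size=16)
h_walk = hurst_exponent(walk, min_size=16)
print(f"H(noise) = {h_noise:.2f}, H(random walk) = {h_walk:.2f}")
```

The uncorrelated series should give a clearly lower exponent than its strongly persistent running sum, which is the kind of contrast in long-range correlation that this feature is meant to capture.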

Correlation Dimension (D _{2})

Largest Lyapunov Exponent (λ _{1})

Hurst Exponent (H)
Classification and Validation
As explained before, the feature set matrix X is composed of the time–frequency spectral, perceptual, and fractal characteristics of each beat segment. The dimension of the matrix X is q × s, q being the overall number of beats (q = 360), and s the number of features extracted (s = 149).
Once the feature matrix was estimated, each column of the matrix X was centered and normalized (i.e., zero mean and \( \| \cdot \|_2 = 1 \)). The time-varying, time–frequency, and perceptual features were then shifted so that none of their values is below 0, and the logarithm of each entry was computed before proceeding with the classification stage, with the aim of minimizing the intraclass variability and increasing the separability of the classes. This procedure is applied because these features are based on the signal’s energy, and the logarithm operator enhances their interpretation.
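The column-wise preprocessing can be sketched as below. The small positive offset `eps` is an assumption introduced for the example so that the smallest shifted value does not produce log(0); the text does not state how this case was handled, and the function names are hypothetical.

```python
import math


def normalize_column(col):
    """Center a feature column to zero mean and scale it to unit Euclidean norm."""
    mean = sum(col) / len(col)
    centered = [v - mean for v in col]
    norm = math.sqrt(sum(v * v for v in centered))
    return [v / norm for v in centered] if norm > 0 else centered


def shift_and_log(col, eps=1e-6):
    """Shift so that no value is below zero, then take the natural logarithm.

    eps is an assumed offset that keeps the minimum strictly positive.
    """
    lowest = min(col)
    return [math.log(v - lowest + eps) for v in col]


# Toy energy-type feature column (hypothetical values, one entry per beat).
column = normalize_column([1.0, 2.0, 3.0, 10.0])
print(shift_and_log(column))
```

In this sketch the logarithm step would be applied only to the TV&TF and perceptual columns, as described above, while the fractal columns would stop after `normalize_column`.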
The automatic identification is carried out as a two-class problem: K1 (normal) and K2 (murmur). For this purpose, a knn classifier3 with k = 9 was used.
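A k-nearest-neighbors decision of this kind can be sketched in a few lines of Python. The Euclidean distance, the function name, and the toy data are assumptions made for illustration; the distance measure actually used is not stated in the text.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=9):
    """Majority vote among the k training vectors closest to the query."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Toy one-dimensional feature vectors (hypothetical values).
train_x = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
train_y = ["K1", "K1", "K1", "K2", "K2", "K2"]
print(knn_predict(train_x, train_y, [0.05], k=3))
```

With two classes, an odd k such as the k = 9 used here avoids tied votes.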

Feature vectors belonging to one family of parameters

Feature sets composed of those characteristics that proved to be the most discriminating in each family. With the aim of finding a feature subset that minimizes the classification error, a heuristic search was carried out. More specifically, a Sequential Forward Floating Selection (SFFS) algorithm with a cost function based on knn3 was used to select the most significant features.
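The floating search can be sketched as follows. This is a generic, simplified SFFS written for illustration, not the implementation used in this work: the `score` callable is a placeholder for any criterion to be maximized (for example, a knn accuracy estimated on held-out beats), and the function name and toy weights are hypothetical.

```python
def sffs(n_features, score, target_size):
    """Sequential forward floating selection over feature indexes.

    score(subset) returns a value to maximize for a list of feature indexes.
    """
    if not 1 <= target_size <= n_features:
        raise ValueError("target_size must be between 1 and n_features")
    selected = []
    best_by_size = {}  # subset size -> (best score seen, subset)
    while len(selected) < target_size:
        # Forward step: add the single feature that helps most.
        candidates = [f for f in range(n_features) if f not in selected]
        best_f = max(candidates, key=lambda f: score(selected + [f]))
        selected = selected + [best_f]
        s = score(selected)
        size = len(selected)
        if size not in best_by_size or s > best_by_size[size][0]:
            best_by_size[size] = (s, list(selected))
        # Floating step: drop features while that beats the best smaller subset.
        while len(selected) > 2:
            drop = max(selected,
                       key=lambda f: score([g for g in selected if g != f]))
            reduced = [g for g in selected if g != drop]
            s_red = score(reduced)
            if s_red > best_by_size[len(reduced)][0]:
                selected = reduced
                best_by_size[len(reduced)] = (s_red, list(reduced))
            else:
                break
    return best_by_size[target_size][1]


# Toy additive criterion (hypothetical weights) standing in for a knn-based cost.
weights = [0.1, 0.9, 0.5, 0.7]
print(sorted(sffs(4, lambda subset: sum(weights[f] for f in subset), 2)))
```

The backward step is what distinguishes the floating search from plain forward selection: a feature added early can be discarded later if a smaller subset without it scores better than any subset of that size seen so far.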
Validation of the classification procedures is achieved according to the methodology suggested in Saenz-Lechon et al.,29 using a cross-validation strategy based on several partitions (10 folds) of the whole training dataset. In each fold, 70% of the samples were used for training, whereas the remaining 30% were used for testing the algorithms. Since the decision is taken for each beat segment, and the database has been segmented on a beat basis, the training and testing subsets can contain feature vectors belonging to the same patients, depending on the strategy followed for segmentation. To compare the robustness of the proposed methods to the intrasubject and intersubject beat variability, the validation was carried out following two different approaches: (a) the training and testing sets were chosen without taking into account the patient-recording relationship, so the training and testing subsets are different but contain beats of all the patients stored in the database; and (b) the validation was performed using recordings of patients different from those used for the training stage.
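The difference between the two validation approaches can be made concrete with a short Python sketch. The function names, the seeding, and the toy patient identifiers are assumptions for illustration; one such 70/30 split would be drawn per fold.

```python
import random


def split_by_beat(n_beats, train_fraction=0.7, seed=0):
    """Approach (a): shuffle beats, ignoring which patient they come from."""
    idx = list(range(n_beats))
    random.Random(seed).shuffle(idx)
    cut = int(round(train_fraction * n_beats))
    return idx[:cut], idx[cut:]


def split_by_patient(patient_ids, train_fraction=0.7, seed=0):
    """Approach (b): all beats of a patient go to one side of the split."""
    patients = sorted(set(patient_ids))
    random.Random(seed).shuffle(patients)
    cut = int(round(train_fraction * len(patients)))
    train_patients = set(patients[:cut])
    train = [i for i, p in enumerate(patient_ids) if p in train_patients]
    test = [i for i, p in enumerate(patient_ids) if p not in train_patients]
    return train, test


# Toy example: 10 hypothetical patients with 3 beats each.
ids = [i // 3 for i in range(30)]
train, test = split_by_patient(ids)
print(sorted({ids[i] for i in train}), sorted({ids[i] for i in test}))
```

Only the patient-wise split guarantees that no subject contributes beats to both subsets, so it measures robustness to intersubject variability, whereas the beat-wise split mainly reflects intrasubject variability.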
Experimental Results
The best filtering algorithm was selected using the cardiologist’s expertise after an exhaustive listening session of the filtered PCG records. Filtering with the Coiflet WT at decomposition level 8, according to the method described, showed the best performance. Nevertheless, in the case of fractal-based feature extraction, a degradation of the classification performance (down to 59.2%) was found experimentally after either denoising or preprocessing filtering, confirming the results presented in Kantz and Schreiber.19
Table 2. Intrabeat segmentation results

| Beat | False negatives | False positives |
| --- | --- | --- |
| Normal | 0/720 | 7/720 |
| With murmur | 0/720 | 108/720 |
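Table 3. Accuracy of the detector for different feature sets without feature extraction using the method of validation (a)

| Feature set | Accuracy (%), nonfiltered signal | AUC, nonfiltered signal | SE, nonfiltered signal | Accuracy (%), filtered signal | AUC, filtered signal | SE, filtered signal |
| --- | --- | --- | --- | --- | --- | --- |
| TV&TF | 94.35 | 98.21 | 0.40 | 95.28 | 98.29 | 0.49 |
| Perceptual | 86.85 | 95.09 | 1.05 | 88.7 | 95.44 | 1.16 |
| Fractal | 97.17 | 99.45 | 0.21 | 59.2 | 68.28 | 3.58 |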
The AUC is also a measure of the classification accuracy, although in this case it is not so precise: the knn classifier gives discrete scores for each sample, so with 9 neighbors the ROC curve is not smooth, since it has only 9 steps according to the scores that form it, independently of the number of samples used in its calculation. As can be seen in Table 3, the AUC value is greater than the accuracy value, but the proportions persist. The SE corresponds to the error of the AUC with a confidence interval of 5%, using a cross-validation process with 10 folds. The fractal features without filtering also gave the lowest SE, which indicates their robustness for the samples used in the test.
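The effect of discrete scores on the AUC can be illustrated with the following Python sketch. It assumes that the knn score of a beat is the fraction of its 9 nearest neighbors labeled as murmur, and it computes the AUC through its rank interpretation (the probability that a murmur beat scores higher than a normal beat, with ties counted as one half); the function name and the toy scores are hypothetical.

```python
def auc_from_scores(scores_pos, scores_neg):
    """AUC as P(positive score > negative score), ties counted as 1/2."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# With k = 9 a score can only take the values 0/9, 1/9, ..., 9/9.
levels = [i / 9 for i in range(10)]
print(len(levels), "possible score values")

# Toy scores (hypothetical): murmur beats vs. normal beats.
murmur_scores = [5 / 9, 2 / 9]
normal_scores = [2 / 9, 0 / 9]
print(auc_from_scores(murmur_scores, normal_scores))
```

Because many beats share the same score, ties are frequent and the ROC curve advances in a few coarse steps, which is why the AUC is a less precise summary here than it would be for a classifier with continuous outputs.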
Table 4. Accuracy of the detector for different feature sets without feature extraction using the method of validation (b)

| Feature set | Accuracy (%) | AUC | SE |
| --- | --- | --- | --- |
| TV&TF | 87.92 | 93.34 | 2.96 |
| Perceptual | 82.03 | 86.51 | 3.6 |
| Fractal | 97.73 | 99.63 | 0.48 |
Table 5. Accuracy of the detector for different feature sets using the feature extraction algorithm

| Feature set | Accuracy (%) | AUC | SE |
| --- | --- | --- | --- |
| All features (s = 149) | 96.11 | 98.88 | 0.48 |
| Sp1, Sp2, Sp3, Fr1, Fr2, Ac1 (s = 6) | 96.39 | 99.49 | 0.22 |
| Fr1, Fr2 (s = 2) | 93.8 | 98.34 | 0.83 |
| Sp1, Sp2, Sp3 (s = 3) | 87.5 | 96.02 | 1.07 |
| Sp2, Sp3 (s = 2) | 86.48 | 95.39 | 1.13 |
Table 5 also shows the classification results with different feature sets, obtained by means of the method explained in section “Classification and Validation” for finding relevant features. It can be observed that fractal features have a high discriminative capability: using only two of these features, the classification accuracy was 93.8%, whereas using two spectral features the accuracy reached only 86.48%, and it grew to just 87.5% when the three most relevant spectral features were used.
Conclusions
To take into account the variability of the PCG induced by the murmurs, it is better to characterize directly the dynamic behavior of the heart sound recordings. According to the experimental results, the fractal features capture the inner structural dynamics of heart sounds. This fact can be explained by the presence of long-range (fractal) correlations, along with distinct classes of nonlinear interactions. As a result, the fractal features applied to the detection of murmurs emerged as the most robust characteristics in the sense of accuracy vs. computational load.
The characterization of the inherent variability of a process must reflect separately each of the possible sources of dynamic behavior (avoiding overlapping). Hence, the variability of the chaos features is linked with the analysis of a complete heartbeat, which defines the periodicity of the cardiac cycle. Concerning the perceptual and spectral features, their estimation may be focused on each biological event, so the four segments per heartbeat ought to be analyzed: S1 sound, systole, S2 sound, and diastole. The resulting performance improves significantly by adjusting the segmentation stage, which is necessary for the successful estimation of perceptual and spectral features. This task becomes more difficult for pathological records. It is important to note that the results obtained using perceptual and spectral features depend strongly on the intrabeat segmentation; in fact, if the segmentation of the components is not performed well, the corresponding features can barely be considered relevant. Thus, by employing another reliable technique for intrabeat segmentation, the performance of these features could be improved. Nevertheless, the main problem is the adjustment and tuning of this algorithm, which in the case of fractal features is not necessary at all.
Regarding feature selection, it is shown that with a reduced feature set (a reduction of 80%), the performance remained similar to the best reported in the literature. Nevertheless, almost the same accuracy can be reached using just two fractal features. Considering the above-mentioned difficulties of the beat segmentation needed to compute the spectral and perceptual features, the extra effort of working with either spectral or perceptual features does not seem worthwhile.
Footnotes
 1.
The GT is an STFT calculated using a Gaussian window.
Notes
Acknowledgments
This research was carried out under grants: 20201004224 and 20201004208, funded by Universidad Nacional de Colombia, Manizales; Condonable credits from COLCIENCIAS; TEC200612887C02 from the Ministry of Science and Technology of Spain; and AL06EXPID033 from the Universidad Politécnica de Madrid, Spain.
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References
 1. Acero A., H. W. Hon (2001). Spoken Language Processing: A Guide to Theory, Algorithm and System Development. Upper Saddle River, NJ: Prentice Hall.
 2. Ahlstrom C., Hult P., Rask P., Karlsson J. E., Nylander E., Dahlstrom U., Ask P. (2006) Feature extraction for systolic heart murmur classification. Ann. Biomed. Eng. 34(11), 1666–1677.
 3. Alpaydin E. (2004). Introduction to Machine Learning. Cambridge, MA: MIT Press.
 4. Beyar R., Levkovitz S., Braun S., Palti Y. (1984) Heart-sound processing by average and variance calculation—physiologic basic and clinical implications. IEEE Trans. Biomed. Eng. BME-31(9), 591–596.
 5. Boashash B. (1992) Estimating and interpreting the instantaneous frequency of a signal. II. Algorithms and applications. Proc. IEEE 80(4), 540–568.
 6. Cao L. (1997) Practical method for determining the minimum embedding dimension of a scalar time series. Physica D: Nonlinear Phenom. 110(1–2), 43–50.
 7. Carvajal, R., M. Vallverdu, R. Baranowski, E. Orlowska-Baranowska, J. J. Zebrowski, and P. Caminal. Dynamical nonlinear analysis of heart rate variability in patients with aortic stenosis. In: Proc. Computers in Cardiology, 2002, pp. 449–452.
 8. Debbal S. M., Bereksi-Reguig F. (2007) Time-frequency analysis of the first and the second heartbeat sounds. Appl. Math. Comput. 184(2), 1041–1052.
 9. Ergen, B., and Y. Tatar. The analysis of heart sounds based on linear and high order statistical methods. In: Proc. 23rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, vol. 3, 2001, pp. 2139–2141.
 10. Ergen, B., and Y. Tatar. Time-frequency analysis of phonocardiogram. In: MEASUREMENT 2003, Fourth International Conference on Measurement, 2003, p. 222.
 11. Ergen, B., and Y. Tatar. Optimal continuous wavelet analysis of periodogram signals. In: IJCI Proceedings of International Conference on Signal Processing, vol. 1, 2003.
 12. Goldberger A. L., Amaral L. A. N., Hausdorff J. M., Ivanov P. C., Peng C. K., Stanley H. E. (2002) Fractal dynamics in physiology: alterations with disease and aging. Proc. Natl. Acad. Sci. USA 99(Suppl 1), 2466–2472.
 13. Hadjitodorov S., Mitev P. (2002) A computer system for acoustic analysis of pathological voices and laryngeal diseases screening. Med. Eng. Phys. 24(6), 419–429.
 14. Hoglund K., Ahlstrom C. H. G., Haggstrom J., Ask P. N. A., Hult P. H. P., Kvart C. (2007) Time-frequency and complexity analyses for differentiation of physiologic murmurs from heart murmurs caused by aortic stenosis in Boxers. Am. J. Vet. Res. 68(9), 962–969.
 15. Javed F., Venkatachalam P. A., Ahmad F. M. (2006) A signal processing module for the analysis of heart sounds and heart murmurs. J. Phys.: Conf. Ser. 34, 1098–1105.
 16. Jeharon, H., H. Jeharon, A. Seagar, and N. Seagar. Feature Extraction from Phonocardiogram for Diagnosis based on Expert System. In: Proc. 27th Annual International Conference of the Engineering in Medicine and Biology Society (IEEE-EMBS 2005), 2005, pp. 5479–5482.
 17. Johnson M. G., Tewfik A., Madhu K. P., Erdman A. G. (2007) Using voice-recognition technology to eliminate cardiac cycle segmentation in automated heart sound diagnosis. Biomed. Instrum. Technol. 41, 157–166.
 18. Kaiser, J. F. On a simple algorithm to calculate the ‘energy’ of a signal. In: Proc. International Conference on Acoustics, Speech, and Signal Processing (ICASSP-90), 1990, pp. 381–384.
 19. Kantz H., Schreiber T. (2002). Nonlinear Time Series Analysis. Cambridge: Cambridge University Press.
 20. Khadra L., Matalgah M., El Asir B., Mawagdeh S. (1991) The wavelet transform and its applications to phonocardiogram signal analysis. Med. Inf. 16(3), 271–277.
 21. Kumar, D., P. Carvalho, M. Antunes, J. Henriques, M. Maldonado, R. Schmidt, and J. Habetha. Wavelet transform and simplicity based heart murmur segmentation. In: Proc. Computers in Cardiology, Valencia, Spain, 2006, pp. 173–176.
 22. Leung, T. S., P. R. White, W. B. Collis, E. Brown, and A. P. Salmon. Classification of heart sounds using time-frequency method and artificial neural networks. In: Proc. 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, vol. 2, 2000, pp. 988–991.
 23. Messer S. R., Agzarian J., Abbott D. (2001) Optimal wavelet denoising for phonocardiograms. Microelectron. J. 32(12), 931–941.
 24. Molau, S., M. Pitz, R. Schluter, and H. Ney. Computing Mel-frequency cepstral coefficients on the power spectrum. In: Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’01), vol. 1, 2001, pp. 73–76.
 25. Ning, T., and K.-S. Hsieh. Delineation of systolic murmurs by autoregressive modelling. In: Proc. IEEE 21st Annual Northeast Bioengineering Conference, 1995, pp. 19–21.
 26. Peters E. E. (1996). Chaos and Order in the Capital Markets. New York: John Wiley and Sons.
 27. Rangayyan R. M. (2001). Biomedical Signal Analysis: A Case-Study Approach. New York: Wiley-IEEE Press.
 28. Rosenstein M. T., Collins J. J., De Luca C. J. (1993) A practical method for calculating largest Lyapunov exponents from small data sets. Physica D: Nonlinear Phenom. 65(1–2), 117–134.
 29. Saenz-Lechon N., Godino-Llorente J. I., Osma-Ruiz V., Gomez-Vilda P. (2006) Methodological issues in the development of automatic systems for voice pathology detection. Biomed. Signal Process. Control 1(2), 120–128.
 30. Sahambi J. S., Tandon S. N., Bhatt R. K. P. (1997) Using wavelet transforms for ECG characterization. An online digital signal processing system. IEEE Eng. Med. Biol. Mag. 16(1), 77–83.
 31. Sejdic, E., and J. Jiang. Comparative study of three time-frequency representations with applications to a novel correlation method. In: Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’04), vol. 2, 2004, pp. 633–636.
 32. Sharif, Z., M. S. Zainal, A. Z. Sha’ameri, and S. H. S. Salleh. Analysis and classification of heart sounds and murmurs based on the instantaneous energy and frequency estimations. In: TENCON 2000. Proceedings, vol. 2, 2000, pp. 130–134.
 33. Tavel M. E., Katz H. (2005) Usefulness of a new sound spectral averaging technique to distinguish an innocent systolic murmur from that of aortic stenosis. Am. J. Cardiol. 95(11), 902–904.
 34. Telatar, Z., and O. Erogul. Heart sounds modification for the diagnosis of cardiac disorders. In: IJCI Proceedings of International Conference on Signal Processing, Çanakkale, vol. 1(2), 2003, pp. 100–105.
 35. Tilkian A., Conover M. (2001) Understanding Heart Sounds and Murmurs: With an Introduction to Lung Sounds, 4th ed. Philadelphia: W. B. Saunders Co.
 36. Voss A., Mix A., Hubner T. (2005) Diagnosing aortic valve stenosis by parameter extraction of heart sound signals. Ann. Biomed. Eng. 33(9), 1167–1174.
 37. Wang, P., Y. Kim, and C. B. Soh. Feature extraction based on Mel-scaled wavelet transform for heart sound analysis. In: Proc. 27th Annual International Conference of the Engineering in Medicine and Biology Society (IEEE-EMBS 2005), 2005, pp. 7572–7575.
 38. Wang P., Lim C. S., Chauhan S., Foo J. Y. A., Anantharaman V. (2007) Phonocardiographic signal analysis method using a modified hidden Markov model. Ann. Biomed. Eng. 35(3), 367–374.