Cough Sound Analysis Can Rapidly Diagnose Childhood Pneumonia

Annals of Biomedical Engineering

Abstract

Pneumonia annually kills over 1,800,000 children throughout the world. The vast majority of these deaths occur in resource-poor regions such as sub-Saharan Africa and remote Asia. Prompt diagnosis and proper treatment are essential to prevent these unnecessary deaths. The reliable diagnosis of childhood pneumonia in remote regions is fraught with difficulties arising from the lack of field-deployable imaging and laboratory facilities as well as the scarcity of trained community healthcare workers. In this paper, we present a pioneering class of technology addressing both of these problems. Our approach is centred on the automated analysis of cough and respiratory sounds, collected via microphones that do not require physical contact with subjects. Cough is a cardinal symptom of pneumonia, but the current clinical routines used in remote settings do not make use of coughs beyond noting their existence as a screening-in criterion. We hypothesized that cough carries vital information to diagnose pneumonia, and developed mathematical features and a pattern classifier system suited for the task. We collected cough sounds from 91 patients suspected of acute respiratory illness such as pneumonia, bronchiolitis and asthma. Non-contact microphones kept by the patient’s bedside were used for data acquisition. We extracted features such as non-Gaussianity and Mel Cepstra from cough sounds and used them to train a Logistic Regression classifier. We used the clinical diagnosis provided by the paediatric respiratory clinician as the gold standard to train and validate our classifier. The methods proposed in this paper could separate pneumonia from other diseases at a sensitivity and specificity of 94% and 75%, respectively, based on parameters extracted from cough sounds alone. The inclusion of other simple measurements such as the presence of fever further increased the performance. These results show that cough sounds indeed carry critical information on the lower respiratory tract, and can be used to diagnose pneumonia. The performance of our method is far superior to that of existing WHO clinical algorithms for resource-poor regions. To the best of our knowledge, this is the first attempt in the world to diagnose pneumonia in humans using cough sound analysis. Our method has the potential to revolutionize the management of childhood pneumonia in remote regions of the world.

References

  1. Abaza, A. A., J. B. Day, J. S. Reynolds, A. M. Mahmoud, W. T. Goldsmith, W. G. McKinney, E. L. Petsonk, and D. G. Frazer. Classification of voluntary cough sound and airflow patterns for detecting abnormal pulmonary function. Cough 5:8, 2009.

  2. Abeyratne, U. Blind Reconstruction of Non-minimum-phase Systems from 1-D Oblique Slices of Bispectrum. IET, 1999.

  3. Bhutta, Z. A. Childhood pneumonia in developing countries: refinement of clinical algorithms is a priority. BMJ 333:612, 2006.

  4. Cardoso, M. R. A., C. M. Nascimento-Carvalho, F. Ferrero, F. M. Alves, and S. N. Cousens. Adding fever to WHO criteria for diagnosing pneumonia enhances the ability to identify pneumonia cases among wheezing children. Arch. Dis. Child. 96:58, 2011.

  5. Falade, A., H. Tschäppeler, B. M. Greenwood, and E. K. Mulholland. Use of simple clinical signs to predict pneumonia in young Gambian children: the influence of malnutrition. Bull. World Health Organ. 73:299, 1995.

  6. Frosh, A. Pediatric airway: cry, stridor, and cough. JAMA 301:2392, 2009.

  7. Gold, B., N. Morgan, and D. Ellis. Speech and Audio Signal Processing: Processing and Perception of Speech and Music. Wiley-Interscience, 2011.

  8. Harari, M., V. Spooner, S. Meisner, M. Carney, F. Shann, and J. De Campo. Clinical signs of pneumonia in children. Lancet 338:928–930, 1991.

  9. Hirschberg, J., I. Lelli, and Z. Garay. The Pediatric Airway: Cry, Stridor, and Cough. Plural Pub., 2009.

  10. IMCI Integrated Management of Childhood Illness. World Health Organization, 2005.

  11. Knocikova, J., J. Korpas, M. Vrabec, and M. Javorka. Wavelet analysis of voluntary cough sounds in patients with respiratory diseases. J. Physiol. Pharmacol. 59:331–340, 2008.

  12. Kosasih, K., U. Abeyratne, and V. Swarnkar. High frequency analysis of cough sounds in pediatric patients with respiratory diseases. In: Engineering in Medicine and Biology Society (EMBC), 2012 Annual International Conference of the IEEE. 2012. IEEE.

  13. Lozano, J., M. Steinhoff, J. Ruiz, M. Mesa, N. Martinez, and B. Dussan. Clinical predictors of acute radiological pneumonia and hypoxaemia at high altitude. Arch. Dis. Child. 71:323–327, 1994.

  14. Mendel, J. M. Tutorial on higher-order statistics (spectra) in signal processing and system theory: theoretical results and some applications. Proc. IEEE 79:278–305, 1991.

  15. Mulholland, E., E. Simoes, M. Costales, E. McGrath, E. Manalac, and S. Gove. Standardized diagnosis of pneumonia in developing countries. Pediatr. Infect. Dis. J. 11:77, 1992.

  16. Ng, A. K., T. S. Koh, E. Baey, T. H. Lee, U. R. Abeyratne, and K. Puvanendran. Could formant frequencies of snore signals be an alternative means for the diagnosis of obstructive sleep apnea? Sleep Med. 9:894–898, 2008.

  17. Oppenheim, A. V., R. W. Schafer, and J. R. Buck. Discrete-Time Signal Processing. Vol. 1999. Englewood Cliffs, NJ: Prentice Hall, 1989.

  18. Palafox, M., H. Guiscafré, H. Reyes, O. Muñoz, and H. Martínez. Diagnostic value of tachypnoea in pneumonia defined radiologically. Arch. Dis. Child. 82:41–45, 2000.

  19. Piirilä, P., and A. Sovijärvi. Differences in acoustic and dynamic characteristics of spontaneous cough in pulmonary diseases. Chest 96:46–53, 1989.

  20. Pneumonia Strategy Overview. Bill & Melinda Gates Foundation (November 2009).

  21. Smith, J. A., J. E. Earis, and A. A. Woodcock. Establishing a gold standard for manual cough counting: video versus digital audio recordings. Cough 2:6, 2006.

  22. Suttorp, N., T. Welte, and R. Marre. Community-Acquired Pneumonia. Springer, 2007.

  23. Swarnkar, V., U. Abeyratne, A. Chang, Y. Amrulloh, A. Setyati, and R. Triasih. Automatic identification of wet and dry cough in pediatric patients with respiratory diseases. Ann. Biomed. Eng. 41, 2013. doi:10.1007/s10439-013-0741-6.

  24. Wardlaw, T., E. W. Johansson, and M. Hodge. Pneumonia: The Forgotten Killer of Children. UNICEF/WHO, 2006.

  25. Thorpe, W., M. Kurver, G. King, and C. Salome. Acoustic analysis of cough. In: Intelligent Information Systems Conference, The Seventh Australian and New Zealand 2001. 2001. IEEE.

  26. WHO/UNICEF. Global Action Plan for the Prevention and Control of Pneumonia (GAPP). WHO Reference Number: WHO/FCH/CAH/NCH/09.04, 23 (2009).

Acknowledgments

This work was supported by the Bill & Melinda Gates Foundation, USA, under its Grand Challenges in Global Health Explorations Grant program (Grant ID# OPP1008199 GCE to Abeyratne).

Author information

Corresponding author

Correspondence to Udantha R. Abeyratne.

Additional information

Associate Editor Merryn Tawhai oversaw the review of this article.

Appendix

Our method requires the computation of a number of mathematical features from cough sounds. This Appendix describes the features we computed from each sub-segment \( x_{i} \), i = 1, 2, 3, of a recorded cough sound x.

The Bispectrum Score (BGS)

The 3rd order spectrum of a signal is known as the bispectrum.2 Unlike the power spectrum (the 2nd order spectrum based on the autocorrelation), the bispectrum preserves Fourier phase information. The bispectrum \( B_{xi} \left( {\omega_{1} ,\omega_{2} } \right) \) of the segment \( x_{i} \) can be estimated using (5):

$$ B_{xi} \left( \omega_{1} ,\omega_{2} \right) = \sum_{\tau_{1} = - \infty}^{+ \infty} \sum_{\tau_{2} = - \infty}^{+ \infty} W\left( \tau_{1} ,\tau_{2} \right) \cdot C_{xi} \left( \tau_{1} ,\tau_{2} \right) e^{ - j\left( \tau_{1} \omega_{1} + \tau_{2} \omega_{2} \right)} $$
(5)

where \( W\left( {\tau_{1} ,\tau_{2} } \right) \) is a bispectrum window function, such as the minimum bispectrum-bias supremum window14 used in this paper, \( C_{xi} \left( {\tau_{1} ,\tau_{2} } \right) \) is the third order cumulant of \( x_{i} \) estimated with (6), and \( \omega_{1} ,\omega_{2} \) denote digital frequencies.

$$ C_{xi} \left( \tau_{1} ,\tau_{2} \right) = \frac{1}{L}\sum_{k = 0}^{L - 1} x_{i} \left( k \right) x_{i} \left( k + \tau_{1} \right) x_{i} \left( k + \tau_{2} \right), \quad \left| \tau_{1} \right| \le Q, \; \left| \tau_{2} \right| \le Q $$
(6)

In (6), Q is the maximum 3rd order correlation lag considered, L is the number of samples in \( x_{i} \), and \( x_{i} \) is a zero-mean signal.

The bispectrum is a 2D signal. However, it can be proven2 that for linear signals, any 1D oblique slice of the bispectrum, other than the slices parallel to the axes ω1 = 0, ω2 = 0 and ω1 + ω2 = 0, carries sufficient information to characterise the entire 2D bispectrum within a phase factor. In this work, we capture the information available in the bispectrum via the diagonal slice P(ω) defined by ω1 = ω2 = ω, i.e. \( P\left( \omega \right) = B_{xi} \left( {\omega , \omega } \right) \). Then the Bispectrum Score (BGS) is computed as defined in (7). In (7) we used \( k_{1} \) = 90 Hz, \( k_{2} \) = 5 kHz, \( k_{3} \) = 6 kHz and \( k_{4} \) = 10.5 kHz.

$$ BGS = \frac{ \int_{k_{1}}^{k_{2}} \left| P\left( \omega \right) \right| d\omega }{ \int_{k_{3}}^{k_{4}} \left| P\left( \omega \right) \right| d\omega } $$
(7)
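
The chain from (5) to (7) can be sketched numerically as follows. This is only an illustrative NumPy sketch, not the authors' implementation: the function names, the default lag limit `Q=16`, the number of integration points and the sampling rate passed in as `fs` are all assumptions, and the lag window defaults to no window (W = 1) rather than the minimum bispectrum-bias supremum window used in the paper. On the diagonal ω1 = ω2 = ω the exponent in (5) depends only on τ1 + τ2, so the cumulants are first summed along anti-diagonals.

```python
import numpy as np

def third_order_cumulants(x, Q):
    """Estimate C(tau1, tau2) for |tau1|, |tau2| <= Q, following (6)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # (6) assumes a zero-mean signal
    L = len(x)
    # Zero-pad so that shifted copies x(k + tau) are defined for every lag.
    xp = np.concatenate([np.zeros(Q), x, np.zeros(Q)])
    lags = np.arange(-Q, Q + 1)
    C = np.empty((2 * Q + 1, 2 * Q + 1))
    for a, t1 in enumerate(lags):
        x1 = xp[Q + t1 : Q + t1 + L]
        for b, t2 in enumerate(lags):
            x2 = xp[Q + t2 : Q + t2 + L]
            C[a, b] = np.sum(x * x1 * x2) / L
    return lags, C

def diagonal_slice(x, fs, freqs_hz, Q=16, window=None):
    """Evaluate P(w) = B(w, w) from (5) at the given frequencies (in Hz)."""
    lags, C = third_order_cumulants(x, Q)
    if window is not None:                # optional 2D lag window W(tau1, tau2)
        C = C * window
    # Sum the cumulants along each anti-diagonal m = tau1 + tau2.
    m = lags[:, None] + lags[None, :]
    c_m = np.bincount((m + 2 * Q).ravel(), weights=C.ravel(),
                      minlength=4 * Q + 1)
    omega = 2.0 * np.pi * np.asarray(freqs_hz, dtype=float) / fs
    m_axis = np.arange(-2 * Q, 2 * Q + 1)
    return np.exp(-1j * np.outer(omega, m_axis)) @ c_m

def _trapezoid(y, x):
    """Plain trapezoidal rule, written out to avoid NumPy version differences."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def bispectrum_score(x, fs, Q=16, k=(90.0, 5000.0, 6000.0, 10500.0), n=200):
    """Ratio of the integrals of |P| over [k1, k2] and [k3, k4], as in (7)."""
    f_lo = np.linspace(k[0], k[1], n)
    f_hi = np.linspace(k[2], k[3], n)
    num = _trapezoid(np.abs(diagonal_slice(x, fs, f_lo, Q)), f_lo)
    den = _trapezoid(np.abs(diagonal_slice(x, fs, f_hi, Q)), f_hi)
    return num / den
```

Because (7) is a ratio of two integrals of |P(ω)|, the score does not change when the segment is multiplied by a constant gain. Note that the upper band limit of 10.5 kHz requires a sampling rate above 21 kHz.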

Non-Gaussianity Score (NGS)

The NGS is a numerical measure of the non-Gaussianity of a given segment of data \( x_{i} \). The normal probability plot can be utilized to obtain a visual measure of the Gaussianity of a set of data, and the NGS quantifies the non-Gaussianity based on regression analysis. We used (8) to estimate the NGS, where p and q represent the normal probability plots of the reference normal data and the analysed data (\( x_{i} \)), respectively, \( \bar{q} \) is the mean of q, and N is the number of data points used in the probability plot.

$$ NGS = 1 - \left( \frac{ \sum_{j = 1}^{N} \left( q\left[ j \right] - p\left[ j \right] \right)^{2} }{ \sum_{j = 1}^{N} \left( q\left[ j \right] - \bar{q} \right)^{2} } \right) $$
(8)
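
A minimal sketch of (8) is given below. The text does not spell out how p and q are constructed, so the choices here are assumptions: q is taken as the sorted, standardised samples of the segment, and p as the standard normal quantiles at the plotting positions (j − 0.5)/N. The function name `ngs` is likewise hypothetical.

```python
import numpy as np
from statistics import NormalDist

def ngs(x):
    """Evaluate (8) for one segment, under the assumptions stated above."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    # q: ordered, standardised data (assumed form of the analysed-data plot)
    q = np.sort((x - x.mean()) / x.std())
    # p: reference normal quantiles at plotting positions (j - 0.5) / N
    nd = NormalDist()
    p = np.array([nd.inv_cdf((j - 0.5) / N) for j in range(1, N + 1)])
    return 1.0 - np.sum((q - p) ** 2) / np.sum((q - q.mean()) ** 2)
```

Under these assumptions, (8) has the form of a coefficient of determination between the two probability plots: a segment whose samples are close to Gaussian gives a value near 1, and the value drops as the distribution departs from Gaussian.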

Formant Frequencies

In speech analysis, formant frequencies (FF) are referred to as the resonances of the vocal tract.16 In cough analysis, it is reasonable to expect that the resonances of the overall airway that contribute to the generation of a cough sound will be represented in the formant structure. One classic example of this is wheeze. The presence of mucus can also change the acoustic properties of the airways. We included the first four formants (F1, F2, F3, F4) in our candidate feature set. We computed F1–F4 by peak picking the Linear Predictive Coding (LPC) spectrum of the cough segments \( x_{i} \). For this work we used a 14th order LPC model with the parameters determined via the Levinson-Durbin recursive procedure.17
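
The formant estimation described above can be sketched as follows: autocorrelation, Levinson-Durbin recursion, then peak picking on the LPC spectral envelope. This is an illustrative sketch rather than the authors' code. The function names, the Hamming window applied to the segment, the FFT length and the rule of taking the lowest-frequency local maxima as F1–F4 are assumptions; only the model order of 14 comes from the text.

```python
import numpy as np

def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelations r[0..order].

    Returns (a, err) with a[0] = 1 and A(z) = 1 + a[1] z^-1 + ... + a[p] z^-p.
    """
    r = np.asarray(r, dtype=float)
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + np.dot(a[1:m], r[m - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        prev = a.copy()
        a[1:m] = prev[1:m] + k * prev[m - 1:0:-1]
        a[m] = k
        err *= 1.0 - k * k                  # updated prediction error power
    return a, err

def lpc_formants(x, fs, order=14, n_formants=4, n_fft=4096):
    """Return up to n_formants peak frequencies (Hz) of the LPC envelope."""
    x = np.asarray(x, dtype=float)
    x = (x - x.mean()) * np.hamming(len(x))     # windowing is an assumption
    r = np.array([np.dot(x[: len(x) - m], x[m:]) for m in range(order + 1)])
    a, _ = levinson_durbin(r, order)
    envelope = 1.0 / np.abs(np.fft.rfft(a, n_fft))   # shape of 1 / |A(e^{jw})|
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    inner = envelope[1:-1]
    # Local maxima of the envelope, excluding the two end points.
    peaks = np.where((inner > envelope[:-2]) & (inner > envelope[2:]))[0] + 1
    return freqs[peaks][:n_formants]
```

A call such as `lpc_formants(segment, fs)` returns the peak frequencies in ascending order. Fewer than four values may come back when the envelope has fewer than four local maxima, so downstream code should handle that case.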

Log Energy (LogE)

The log energy for every sub-segment \( x_{i} \) was computed using (9):

$$ \text{LogE} = 10\log_{10} \left( \varepsilon + \frac{1}{N}\sum_{k = 1}^{N} x_{i} \left( k \right)^{2} \right) $$
(9)

In (9), N is the number of samples in \( x_{i} \) and ε is an arbitrarily small positive constant added to prevent any inadvertent computation of the logarithm of 0.

Zero Crossing (Zcr)

The number of zero crossings was counted for each sub-segment \( x_{i} \).

Kurtosis (Kurt)

The kurtosis is a measure of how peaky the probability density distribution of \( x_{i} \) is. It is the fourth central moment of \( x_{i} \) normalized by the fourth power of its standard deviation, and can be computed using (10), where μ and σ respectively denote the mean and the standard deviation of \( x_{i} \).

$$ {\text{Kurt}} = \frac{{E\{ \left( {x_{i} \left[ k \right] - \mu } \right)^{4} \} }}{{\sigma^{4} }} $$
(10)
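
The three time-domain features above (LogE, Zcr and Kurt) are simple enough to sketch in a few lines of NumPy. The function names and the value of `eps` are illustrative assumptions, as is the convention of treating an exact zero sample as positive when counting sign changes.

```python
import numpy as np

def log_energy(x, eps=1e-12):
    """Log energy in dB as in (9); eps guards against log10(0)."""
    x = np.asarray(x, dtype=float)
    return 10.0 * np.log10(eps + np.mean(x ** 2))

def zero_crossings(x):
    """Number of sign changes between consecutive samples."""
    x = np.asarray(x, dtype=float)
    negative = np.signbit(x)              # exact zeros count as positive
    return int(np.count_nonzero(negative[1:] != negative[:-1]))

def kurtosis(x):
    """Fourth central moment divided by sigma^4, as in (10)."""
    x = np.asarray(x, dtype=float)
    mu, sigma = x.mean(), x.std()
    return np.mean((x - mu) ** 4) / sigma ** 4
```

With this definition a Gaussian segment has a kurtosis close to 3, since no offset of 3 is subtracted in (10).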

Mel-Frequency Cepstral Coefficients (MFCC)

MFCCs have been widely used in speech recognition systems.7,9 MFCCs provide some resilience to the non-linguistic sources of variance in speech signals. They also provide orthogonal features, which facilitates the training of the classifier. The computation of MFCCs involves the estimation of short-term power spectra, mapping to the Mel frequency scale and then computing the cepstral coefficients. In our work, we included 12 MFCCs in our feature set.
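
The three steps named above (short-term power spectra, Mel mapping, cepstral coefficients) can be sketched as below. Apart from the count of 12 coefficients, every parameter is an assumption made for illustration: the 25 ms frames with 10 ms hop, the Hamming window, the 26 triangular filters, the common 2595·log10(1 + f/700) Mel formula, and the choice to drop the zeroth coefficient. The function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters with centres equally spaced on the Mel scale."""
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0),
                                  n_filters + 2))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    fb = np.zeros((n_filters, len(freqs)))
    for i in range(n_filters):
        lo, centre, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (centre - lo)
        falling = (hi - freqs) / (hi - centre)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def mfcc(x, fs, n_mfcc=12, n_filters=26, frame_len=0.025, hop=0.010):
    """Return an array of shape (n_frames, n_mfcc); needs len(x) >= one frame."""
    x = np.asarray(x, dtype=float)
    flen = int(round(frame_len * fs))
    fhop = int(round(hop * fs))
    n_fft = int(2 ** np.ceil(np.log2(flen)))
    n_frames = 1 + (len(x) - flen) // fhop
    idx = np.arange(flen)[None, :] + fhop * np.arange(n_frames)[:, None]
    frames = x[idx] * np.hamming(flen)
    # Step 1: short-term power spectrum of each frame
    power = np.abs(np.fft.rfft(frames, n_fft, axis=1)) ** 2 / n_fft
    # Step 2: map to the Mel scale and take logs (small floor avoids log(0))
    log_mel = np.log(power @ mel_filterbank(n_filters, n_fft, fs).T + 1e-12)
    # Step 3: DCT-II over the filter axis, keeping coefficients 1..n_mfcc
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(1, n_mfcc + 1), n + 0.5)
                   / n_filters)
    return log_mel @ basis.T
```

The function returns one row of 12 coefficients per frame. How these frame-level values are summarised for each sub-segment is not covered by this sketch.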

Cite this article

Abeyratne, U.R., Swarnkar, V., Setyati, A. et al. Cough Sound Analysis Can Rapidly Diagnose Childhood Pneumonia. Ann Biomed Eng 41, 2448–2462 (2013). https://doi.org/10.1007/s10439-013-0836-0

  • DOI: https://doi.org/10.1007/s10439-013-0836-0
