Sequential Modeling for Polyps Identification from the Vocal Data

  • Fangqi Zhu
  • Qilian Liang
  • Zhen Zhong (corresponding author)
Conference paper
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 516)


Neural networks have seen a revival, with impact across disciplines and record-breaking performance in a variety of applications. In this paper, we employ a deep sequential model for polyp detection from vocal data. Previous research on acoustic signal recognition (ASR) has focused on hand-crafted features paired with classical machine learning models, such as Mel-frequency cepstral coefficients with hidden Markov models and Gaussian mixture models. The deep model demonstrates its flexibility and its potential to outperform these traditional methods, and we extend its scope to medical symptom identification. We establish a mapping from the raw vocal signal to symptom recognition and show that the model achieves good recognition accuracy, which suggests it may be applicable to clinical diagnosis in the near future.
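The abstract does not specify the architecture, so the following is only a minimal sketch of the general idea, assuming a single-layer LSTM that reads the raw signal frame by frame and emits one probability per recording. All names (frame_signal, init_params, lstm_step, predict_proba), the frame length, and the hidden size are hypothetical, and the weights are random and untrained; it is not the authors' model.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def frame_signal(signal, frame_len):
    """Split a 1-D signal into non-overlapping frames, dropping any remainder."""
    n_frames = len(signal) // frame_len
    return [signal[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


def init_params(input_dim, hidden_dim, seed=0):
    """Random, untrained weights for the four LSTM gates and a linear read-out."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    params = {}
    for gate in ("i", "f", "o", "g"):
        params["W" + gate] = mat(hidden_dim, input_dim + hidden_dim)
        params["b" + gate] = [0.0] * hidden_dim
    params["w_out"] = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
    params["b_out"] = 0.0
    return params


def affine(W, b, v):
    """Compute W @ v + b for list-of-lists W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(params, x, h, c):
    """One LSTM update: gates act on the concatenation of input x and state h."""
    z = list(x) + list(h)
    i = [sigmoid(a) for a in affine(params["Wi"], params["bi"], z)]    # input gate
    f = [sigmoid(a) for a in affine(params["Wf"], params["bf"], z)]    # forget gate
    o = [sigmoid(a) for a in affine(params["Wo"], params["bo"], z)]    # output gate
    g = [math.tanh(a) for a in affine(params["Wg"], params["bg"], z)]  # candidate
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def predict_proba(params, signal, frame_len):
    """Run the LSTM over all frames and squash the final state to (0, 1)."""
    hidden_dim = len(params["w_out"])
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for frame in frame_signal(signal, frame_len):
        h, c = lstm_step(params, frame, h, c)
    logit = sum(w * hk for w, hk in zip(params["w_out"], h)) + params["b_out"]
    return sigmoid(logit)


if __name__ == "__main__":
    # Synthetic stand-in for a voice recording (assumption: a plain sine wave).
    signal = [math.sin(2 * math.pi * 5 * t / 160) for t in range(160)]
    params = init_params(input_dim=16, hidden_dim=8)
    print(predict_proba(params, signal, frame_len=16))
```

In practice the weights would be fitted on labelled recordings, and the printed value here carries no diagnostic meaning. The point of the sketch is only that the recurrent state lets the model consume the raw sequence directly, without a separate hand-crafted feature stage.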


Keywords: Vocal features, Polyps, Sequential model, LSTM



This work was supported in part by NSFC under Grants 61771342, 61731006, and 61711530132, and by the Tianjin Higher Education Creative Team Funds Program.



Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. Department of Electrical Engineering, University of Texas at Arlington, Arlington, USA
  2. Department of Otolaryngology Head and Neck Surgery, Peking University First Hospital, Beijing, China
