Durian Ripeness Striking Sound Recognition Using N-gram Models with N-best Lists and Majority Voting

  • Rong Phoophuangpairoj
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 265)


Durians are green, spiky fruits that are considered a delicacy throughout Southeast Asia and are valued for their unique flavor and powerful taste. Because durians are expensive and their ripeness is difficult to judge from external appearance, it is desirable to determine their quality without cutting them. Consumers in Southeast Asia and China have found, after buying and cutting durians, that the fruit is not ripe or ready to eat. Studying the characteristics of striking signals and developing an automated, non-destructive method of recognizing durian ripeness levels could therefore benefit consumers. This paper proposes a method of recognizing durian ripeness from striking signals using N-gram models with N-best lists and majority voting. The recognition process consists of three stages: 1) extracting acoustic features from the striking signals, 2) recognizing unripe and ripe durian striking signals using the N-gram models, and 3) determining the ripeness from the N-best lists by majority voting. With 3-best lists and majority voting, the method achieved average ripeness recognition rates of 95.8%, 90.4% and 93.1% on the untrained test set, the unknown test set and both test sets combined, respectively. The results indicate that the method is accurate enough to help consumers select a ripe durian.
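
The third stage, majority voting over an N-best list, can be illustrated with a minimal Python sketch. It is not the author's implementation: the function name `vote_ripeness`, the label strings, the representation of each hypothesis as a list of per-strike labels, and the tie-breaking rule are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter


def vote_ripeness(n_best):
    """Return the majority label across all hypotheses in an N-best list.

    n_best: list of hypotheses, best first. Each hypothesis is assumed to be
    a list of per-strike labels such as ["ripe", "ripe", "unripe"].
    """
    if not n_best or not any(n_best):
        raise ValueError("n_best must contain at least one label")

    # Pool the labels from every hypothesis and count them.
    counts = Counter(label for hyp in n_best for label in hyp)
    top = max(counts.values())
    tied = {label for label, c in counts.items() if c == top}
    if len(tied) == 1:
        return tied.pop()

    # Tie-break (an assumption, not from the paper): prefer the tied label
    # that appears first when scanning hypotheses from best to worst.
    for hyp in n_best:
        for label in hyp:
            if label in tied:
                return label


# Hypothetical 3-best list for one durian struck three times.
three_best = [
    ["ripe", "ripe", "unripe"],
    ["ripe", "unripe", "unripe"],
    ["ripe", "ripe", "ripe"],
]
decision = vote_ripeness(three_best)
```

Pooling labels across the whole N-best list lets lower-ranked hypotheses correct a top-ranked hypothesis that contains a misrecognized strike, which is one plausible reading of why voting over 3-best lists could help.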


N-gram, HMM, MFCC, durian ripeness, durian striking signals, striking sound recognition, N-best lists, majority voting





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Department of Computer Engineering, College of Engineering, Rangsit University, Pathum Thani, Thailand
