Clinical application of modified bag-of-features coupled with hybrid neural-based classifier in dengue fever classification using gene expression data


Detecting and classifying dengue fever has become vital given the recent outbreaks of different forms of the disease. Advances in microarray technology can be employed for this classification task, and several studies have established that the gene selection phase plays a significant role in classifier performance. The current study therefore focuses on detecting two variants of the disease, namely dengue fever (DF) and dengue hemorrhagic fever (DHF). A modified bag-of-features method is proposed to select the most promising genes for classification. A modified cuckoo search optimization algorithm is then used to support an artificial neural network (ANN-MCS) that classifies unknown subjects into three classes: DF, DHF, and a third class containing convalescent and normal cases. The proposed method is compared with three other well-known classifiers: a multilayer perceptron feed-forward network (MLP-FFN), an artificial neural network (ANN) trained with cuckoo search (ANN-CS), and an ANN trained with particle swarm optimization (ANN-PSO). Experiments were carried out with different numbers of clusters in the initial bag-of-features-based feature selection phase. After the reduced dataset was obtained, the hybrid ANN-MCS model was employed for classification. The results were compared using performance metrics derived from the confusion matrix. The experiments indicated a highly statistically significant improvement of the proposed classifier over the traditional ANN-CS model.
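As a rough illustration of the confusion matrix-based evaluation mentioned above, the following sketch computes accuracy and per-class precision and recall for the three classes. The label name "Other" (standing for the convalescent and normal cases), the function names, and the toy labels are assumptions made for illustration; they are not data or code from the study.

```python
# Minimal sketch of confusion matrix-based metrics for a three-class problem.
# "Other" is an assumed label for the combined convalescent/normal class.
CLASSES = ["DF", "DHF", "Other"]


def confusion_matrix(y_true, y_pred, classes=CLASSES):
    """Build a matrix whose rows are true classes and columns are predictions."""
    index = {label: i for i, label in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for true_label, pred_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[pred_label]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of subjects on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_metrics(matrix, classes=CLASSES):
    """One-vs-rest precision and recall for each class."""
    metrics = {}
    for i, label in enumerate(classes):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                  # true class i, predicted elsewhere
        fp = sum(row[i] for row in matrix) - tp   # predicted i, true class elsewhere
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        metrics[label] = {"precision": precision, "recall": recall}
    return metrics


# Toy labels, invented purely to exercise the functions.
toy_true = ["DF", "DF", "DF", "DHF", "DHF", "Other", "Other", "Other"]
toy_pred = ["DF", "DF", "DHF", "DHF", "DHF", "Other", "DF", "Other"]

toy_matrix = confusion_matrix(toy_true, toy_pred)
toy_accuracy = accuracy(toy_matrix)
toy_metrics = per_class_metrics(toy_matrix)
```

The one-vs-rest treatment is one common way to extend binary metrics to three classes; the study may aggregate its metrics differently.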





Author information



Corresponding author

Correspondence to Sankhadeep Chatterjee.


About this article


Cite this article

Chatterjee, S., Dey, N., Shi, F. et al. Clinical application of modified bag-of-features coupled with hybrid neural-based classifier in dengue fever classification using gene expression data. Med Biol Eng Comput 56, 709–720 (2018).



  • Dengue fever
  • Bag-of-features
  • Modified cuckoo search
  • Artificial neural networks
  • Gene expression data
  • Incremental feature selection scheme