Abstract
Detecting and classifying dengue fever has become vital given the recent outbreaks of its different forms. Advances in microarray technology can be employed for this classification task, and several studies have established that the gene selection phase plays a significant role in classifier performance. The current study therefore focuses on detecting two variants of the disease, namely dengue fever (DF) and dengue hemorrhagic fever (DHF). A modified bag-of-features method is proposed to select the most promising genes for classification. A modified cuckoo search optimization algorithm is then used to support the artificial neural network (ANN-MCS) in classifying unknown subjects into three classes: DF, DHF, and a third class containing convalescent and normal cases. The proposed method is compared with three other well-known classifiers: a multilayer perceptron feed-forward network (MLP-FFN), an artificial neural network (ANN) trained with cuckoo search (ANN-CS), and an ANN trained with particle swarm optimization (ANN-PSO). Experiments were carried out with different numbers of clusters in the initial bag-of-features-based feature selection phase. After the reduced dataset was obtained, the hybrid ANN-MCS model was employed for classification. The results were compared using confusion-matrix-based performance metrics. The experimental results indicated a highly statistically significant improvement of the proposed classifier over the traditional ANN-CS model.
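As an illustration of the confusion-matrix-based comparison mentioned above, the following is a minimal sketch of how accuracy, precision, recall, and F-measure could be computed for a three-class problem. It is not the authors' code: the label name `Other` (standing for the convalescent and normal class), the function names, and the choice of these four metrics are assumptions made for the example.

```python
from typing import Dict, List, Sequence

# Class labels; "Other" is an assumed name for the convalescent + normal class.
CLASSES = ["DF", "DHF", "Other"]


def confusion_matrix(y_true: Sequence[str], y_pred: Sequence[str],
                     labels: Sequence[str]) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix: List[List[int]]) -> float:
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_metrics(matrix: List[List[int]],
                      labels: Sequence[str]) -> Dict[str, Dict[str, float]]:
    """One-vs-rest precision, recall, and F-measure for each class."""
    results = {}
    for k, label in enumerate(labels):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                   # true class k, predicted elsewhere
        fp = sum(row[k] for row in matrix) - tp    # predicted k, true class elsewhere
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        results[label] = {"precision": precision, "recall": recall,
                          "f_measure": f_measure}
    return results


if __name__ == "__main__":
    # Made-up labels for illustration only; these are not results from the study.
    y_true = ["DF", "DF", "DHF", "DHF", "Other", "Other"]
    y_pred = ["DF", "DHF", "DHF", "DHF", "Other", "DF"]
    cm = confusion_matrix(y_true, y_pred, CLASSES)
    print(cm)
    print(accuracy(cm))
    print(per_class_metrics(cm, CLASSES))
```

Computing the metrics per class in a one-vs-rest fashion keeps the DF and DHF error rates visible separately, which a single overall accuracy figure would hide.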
Cite this article
Chatterjee, S., Dey, N., Shi, F. et al. Clinical application of modified bag-of-features coupled with hybrid neural-based classifier in dengue fever classification using gene expression data. Med Biol Eng Comput 56, 709–720 (2018). https://doi.org/10.1007/s11517-017-1722-y
Keywords
- Dengue fever
- Bag-of-features
- Modified cuckoo search
- Artificial neural networks
- Gene expression data
- Incremental feature selection scheme