Self-organizing mapping based swarm intelligence for secondary and tertiary proteins classification

  • Md. Sarwar Kamal
  • Md. Golam Sarowar
  • Nilanjan Dey
  • Amira S. Ashour
  • Shamim H. Ripon
  • B. K. Panigrahi
  • João Manuel R. S. Tavares
Original Article


Proteins play a significant role in animal and human health. Interactions among proteins are numerous and complex, and separating proteins is a challenging process in molecular biology. Computational tools help to simulate the analysis in order to reduce the training data into small testing data. In this work, large proteins are mapped using self-organizing maps (SOMs). Neural network based SOMs play a significant role in reducing the irregular shapes of protein interactions, and iterative checking enables the organization of all proteins. In the next stage, particle swarm optimization (PSO), a swarm intelligence technique, is applied to classify the protein families. Secondary (two-dimensional) and tertiary (three-dimensional) proteins are grouped. Two-dimensional proteins contain fewer hydrocarbons than three-dimensional proteins. For faster analysis, the angles of the proteins are taken into account. The SOM approach is compared with the Bounding Box approach. The experimental evaluations show that swarm intelligence achieved faster processing, with lower memory consumption and less time. Since PSO combines protein datasets using fuzzy values, the compactness, or integration, of similar proteins is strong. The Bounding Box approach, on the other hand, uses crisp values and therefore needs more space to organize the whole data. Without SOMs, swarm intelligence also gives poor results because of the excessive time and storage it requires. Moreover, for almost all classification and clustering tools, the overall classification task becomes slow, time consuming, space consuming and less sensitive because of noise and irrelevant data in the input datasets. Thus, the proposed SOM-based PSO approach required less time while efficiently classifying proteins into secondary and tertiary structures.
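The two-stage idea described above (a SOM first compresses the data into a small set of prototypes, then PSO groups those prototypes into two classes) can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic two-column "angle" features, the 4 x 4 grid, the quantization-error objective, the crisp nearest-centroid assignment and every function name and parameter value below are assumptions chosen only to make the example self-contained.

```python
import numpy as np


def train_som(data, grid=(4, 4), epochs=20, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM; returns prototypes of shape (rows*cols, dim)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Grid coordinates of each node, used by the neighbourhood function.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise prototypes from randomly chosen samples.
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    n_steps, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                 # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3    # shrinking neighbourhood radius
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))  # best matching unit
            grid_dist2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights


def quantization_error(centroids, points):
    """Mean distance from each point to its nearest centroid."""
    d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return d.min(axis=1).mean()


def pso_centroids(points, k=2, n_particles=15, iters=60, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO; each particle encodes k candidate centroids."""
    rng = np.random.default_rng(seed)
    # Positions have shape (n_particles, k, dim), seeded from the points themselves.
    pos = points[rng.integers(0, len(points), (n_particles, k))].astype(float)
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_cost = np.array([quantization_error(p, points) for p in pos])
    g = np.argmin(pbest_cost)
    gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        cost = np.array([quantization_error(p, points) for p in pos])
        improved = cost < pbest_cost
        pbest[improved] = pos[improved]
        pbest_cost[improved] = cost[improved]
        g = np.argmin(pbest_cost)
        if pbest_cost[g] < gbest_cost:
            gbest, gbest_cost = pbest[g].copy(), pbest_cost[g]
    return gbest, gbest_cost


# Synthetic stand-in for per-protein angle features (two hypothetical groups).
rng = np.random.default_rng(42)
features = np.vstack([
    rng.normal(loc=[-2.0, -2.0], scale=0.3, size=(100, 2)),
    rng.normal(loc=[2.0, 2.0], scale=0.3, size=(100, 2)),
])

prototypes = train_som(features)              # stage 1: 200 samples -> 16 prototypes
centroids, cost = pso_centroids(prototypes)   # stage 2: PSO on the prototypes only
# Crisp nearest-centroid assignment of the original samples (an assumption).
labels = np.argmin(
    np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2), axis=1
)
```

In this sketch the PSO objective is evaluated on the 16 SOM prototypes rather than on all 200 samples, which is one plausible reading of why the SOM stage would reduce the time and storage needed by the swarm stage.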


Proteins, Self-organizing map, Swarm intelligence, Bounding box, Tertiary proteins



Copyright information

© Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  1. East West University Bangladesh, Dhaka, Bangladesh
  2. Department of Information Technology, Techno India College of Technology, Kolkata, India
  3. Department of Electronics and Electrical Communications Engineering, Faculty of Engineering, Tanta University, Tanta, Egypt
  4. Department of Electrical Engineering, Indian Institute of Technology, Delhi, India
  5. Departamento de Engenharia Mecânica, Instituto de Ciência e Inovação em Engenharia Mecânica e Engenharia Industrial, Faculdade de Engenharia, Universidade do Porto, Porto, Portugal
