Skip to main content
Log in

Self-organizing mapping based swarm intelligence for secondary and tertiary proteins classification

  • Original Article
  • Published:
International Journal of Machine Learning and Cybernetics Aims and scope Submit manuscript

Abstract

Proteins have a significant role in animals and human health. Interactions among proteins are complex and large. Proteins separations are challenging process in molecular biology. Computational tools help to simulate the analysis in order to reduce the training data into small testing data. Large proteins have been mapped using self-organizing maps (SOMs). Neural network based SOMs has a significant role in reducing the irregular shapes of proteins interactions. Iterative checking enables the organizations of all proteins. In next stage, particle swarm intelligence is applied to classify the proteins’ families. In the current work, secondary (Two dimensional) and tertiary proteins (Three dimensional) proteins have been grouped. Two dimensional proteins contain fewer hydro-carbons than three dimensional proteins. For faster analysis, the angles of the proteins are taken into account. The SOMs is compared with Bounding Box approach. In final, the experimental evolutions show that swarm intelligence achieved faster processing through enabling less memory consumptions and time. Since PSO combines proteins datasets in fuzzy values, the compactness or integration of similar proteins are strong. On the other hand, Bounding Box uses the Crisp value. Therefore, it needs more space to organize the whole data. Without SOMs, swarm intelligence also results are poor due to the excessive time consuming and required storage area. Moreover, for almost all classification and clustering tools, it is observed that the overall classification task becomes slow, time consuming, space consuming and also less sensitive because of noises, irrelevant data in input datasets. Thus, the proposed SOM based PSO approach achieved less time consuming with efficient classification into secondary and tertiary proteins.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17

Similar content being viewed by others

References

  1. Turcu A, Palmieri R, Ravindran B, Hirve S (2016) Automated data partitioning for highly scalable and strongly consistent transactions. IEEE Trans Parallel Distrib Syst 27(1):106–118

    Article  Google Scholar 

  2. Chien JT, KuBayesian YC (2016) Recurrent neural network for language modeling. IEEE Trans Neural Netw Learn Syst 27(2):361–374

    Article  MathSciNet  Google Scholar 

  3. Deng SP, Zhu L, Huang DS (2016) Predicting hub genes associated with cervical cancer through gene co-expression networks. IEEE/ACM Trans Comput Biol Bioinform 13(1):27–35

    Article  Google Scholar 

  4. Hsieh SY, Chou YC (2016) A Faster cDNA microarray gene expression data classifier for diagnosing diseases. IEEE/ACM Trans Comput Biol Bioinform 13(1):43–54

    Article  Google Scholar 

  5. Dhulekar N, Ray S, Yuan D, Baskaran A, Oztan B, Larsen M, Yene B (2016) Prediction of growth factor-dependent cleft formation during branching morphogenesis using a dynamic graph-based growth model. IEEE/ACM Trans Comput Biol Bioinform 13(2):350–363

    Article  Google Scholar 

  6. Sáez JA, Luengo J, Herrera F (2016) Evaluating the classifier behavior with noisy data considering performance and robustness: the equalized loss of accuracy measure. Neurocomputing 176:26–35

    Article  Google Scholar 

  7. Saez JA, Galar M, Luengo J, Herrera F (2016) INFFC: an iterative class noise filter based on the fusion of classifiers with noise sensitivity control. Inf Fusion 27:505–636

    Article  Google Scholar 

  8. Fdez JA, Alonso JM (2016) A survey of fuzzy systems software: taxonomy, current research trends and prospects. IEEE Trans Fuzzy Syst 24(1):40–56

    Article  Google Scholar 

  9. Palacios A, Sanchez L, Couso I (2016) An extension of the FURIA classification algorithm to low quality data through fuzzy rankings and its application to the early diagnosis of dyslexia. Neurocomputing 176:60–71

    Article  Google Scholar 

  10. González M, Bergmeir C, Triguero I, Rodríguez Y, Benítez JM (2016) On the stopping criteria for k-nearest neighbor in positive unlabeled time series classification problems. Inf Sci 328:42–59

    Article  MATH  Google Scholar 

  11. Martin D, Fdez JA, Rosete A, Herrera F (2016) NICGAR: a niching genetic algorithm to mine a diverse set of interesting quantitative association rules. Inf Sci 355–356:208–228

    Article  Google Scholar 

  12. Butt AH, Khan SA, Jamil H, Rasool N, Khan YD (2016) A prediction model for membrane proteins using moments based features. Biomed Res Int 2016:8370132. doi:10.1155/2016/8370132

    Article  Google Scholar 

  13. Vala MHJ, Baxi A (2013) A review on otsu image segmentation algorithm. Int J Adv Res Comput Eng Technol 2(2):387–389 (ISSN: 2278–1323)

    Google Scholar 

  14. Akbal-Delibas B, Farhoodi R, Pomplun M, Haspel N (2016) Accurate refinement of docked protein complexes using evolutionary information and deep learning. J Bioinform Comput Biol 14(3):1642002. doi:10.1142/S0219720016420026

    Article  Google Scholar 

  15. Wang B, Wang M, Jiang Y, Sun D, Xu X (2015) A novel network-based computational method to predict protein phosphorylation on tyrosine sites. J Bioinform Comput Biol 13:1542005. doi:10.1142/S0219720015420056

    Article  Google Scholar 

  16. Wang D, Hou J (2015) Explore the hidden treasure in protein–protein interaction networks—an iterative model for predicting protein functions. J Bioinform Comput Biol 13(5):1550026. doi:10.1142/S0219720015500262

    Article  Google Scholar 

  17. Watson JD, Laskowski RA, Thornton JM (2005) Predicting protein function from sequence and structural data. Curr Opin Struct Biol 15(3):275–284

    Article  Google Scholar 

  18. Tan S, Guan Z, Cai D, Qin X, Bu J, Chen C (2014) Mapping users across networks by manifold alignment on hypergraph. In Proceedings of the twenty-eighth AAAI conference on artificial intelligence (AAAI’14), 159–165

  19. Bangyal W, Jamil A, Shafi I, Abbas Q (2011) propagation network-based approach for contraceptive method choice classification task. J Exp Theor Artif Intell 24(2):211–218

    Article  Google Scholar 

  20. Brereton RG, Lloyda GR (2010) Support vector machines for classification and regression. Analyst. doi:10.1039/B918972F

    Article  Google Scholar 

  21. Iranmanesh A, Fahimi M (2001) Genetic algorithm trained counter-propagation neural net in structural optimization. In: Proceedings of the sixth international conference on Application of artificial intelligence to civil and structural engineering (ICAAICSE ‘01), Topping BHV, Kumar B (Eds.). Civil-Comp Press, pp. 85-86

  22. Bollen J, Van de Sompel H, Hagberg A, Chute R (2009) A principal component analysis of 39 scientific impact measures. PLoS One 4(6):e6022. doi:10.1371/journal.pone.0006022

    Article  Google Scholar 

  23. MacQueen JB (1967) “Some methods for classification and analysis of multivariate observations. In: Proceedings of 5-th Berkeley symposium on mathematical statistics and probability”. Berkeley, University of California Press, 1:281–297

  24. Yuan X, Martínez J-F, Eckert M, López-Santidrián L (2016) An improved Otsu threshold segmentation method for underwater simultaneous localization and mapping-based navigation. Sensors 16(7):1148. doi:10.3390/s16071148

    Article  Google Scholar 

  25. Xu ZB, Chen PJ, Yan SL, Wang TH (2014) Study on Otsu threshold method for image segmentation based on genetic algorithm. Adv Mater Res 999:925–928

    Article  Google Scholar 

  26. Hegde GP, Seetha M, Hegde N (2016) Kernel locality preserving symmetrical weighted fisher discriminant analysis based subspace approach for expression recognition. Int J Eng Sci Technol 19(3):1321–1333. doi:10.1016/j.jestch.2016.03.005

    Article  Google Scholar 

  27. Taormina R, Chau KW (2015) Data-driven input variable selection for rainfall–runoff modeling using binary-coded particle swarm optimization and extreme learning machines. J Hydrol 529:1617–1632

    Article  Google Scholar 

  28. Pedruzzi I, Rivoire C, Auchincloss AH et al (2013) HAMAP in 2013, new developments in the protein family classification and annotation system. Nucleic Acids Res 41(D1):D584–D589. doi:10.1093/nar/gks1157

    Article  Google Scholar 

  29. Maddouri RSM, Nguifo EM (2010) Protein sequences classification by means of feature extraction with substitution matrices. BMC Bioinform 11:175

    Article  Google Scholar 

  30. Bernardes JS, Fernandez JH, Vasconcelos ATR (2008) Structural descriptor database: a new tool for sequence-based functional site prediction. BMC Bioinform 9:492

    Article  Google Scholar 

  31. Yan R-X, Si J-N, Wang C, Zhang Z (2009) DescFold: a web server for protein fold recognition. BMC Bioinform 10:416

    Article  Google Scholar 

  32. Rost B, Liu J, Nair R, Wrzeszczynski KO, Ofran Y (2003) Automatic prediction of protein function. Cell Mol Life Sci. 60(12):2637–2650

    Article  Google Scholar 

  33. Baugh EH, Simmons-Edler R, Müller CL, Alford RF, Volfovsky N, Lash AE, Bonneau R (2016) Robust classification of protein variation using structural modelling and large-scale data integration. Oxf J Sci Math Nucleic Acids Res 44(6):2501–2513

    Article  Google Scholar 

  34. Dinubhai PM, Shah HB (2013) Comparative study of multi-class protein structure prediction using advanced soft computing techniques. Int J Eng Sci Innov Technol 2(2):275–282

    Google Scholar 

  35. Burkhardt K, Schneider B, Ory J (2006) A biocurator perspective: annotation at the research collaboratory for structural bioinformatics protein data bank. PLoS Comput Biol 2(10):e99. doi:10.1371/journal.pcbi.0020099

    Article  Google Scholar 

  36. Li YH, Xu JY, Tao L, Li XF, Li S et al (2016) SVM-Prot 2016: a web-server for machine learning prediction of protein functional families from sequence irrespective of similarity. PLos One 11(8):e0155290. doi:10.1371/journal.pone.0155290

    Article  Google Scholar 

  37. Cai Y-D, Liu X-J, Xu X-B, Zhou G-P (2001) Support vector machines for predicting protein structural class. BMC Bioinform 2:3

    Article  Google Scholar 

  38. Selvaraj MK, Puri M, Dikshit KL, Lefevre C (2016) BacHbpred: support vector machine methods for the prediction of bacterial hemoglobin-like proteins. Adv Bioinform 2016:8150784. doi:10.1155/2016/8150784

    Article  Google Scholar 

  39. Dhifli W, Diallo AB (2016) ProtNN: fast and accurate nearest neighbor protein function prediction based on graph embedding in structural and topological space, Cornell University, pp 1–28

  40. Krissinel E, Henrick K (2004) Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. Acta Crystallogr Sect D Biol Crystallogr 60(12):2256–2268

    Article  Google Scholar 

  41. Bhattacharya S, Bhattacharyya C, Chandra NR (2007) Comparison of protein structures by growing neighborhood alignments. BMC Bioinform 8:77. doi:10.1186/1471-2105-8-77

    Article  Google Scholar 

  42. Nandanwar S, Murty MN Structural neighborhood based classification of nodes in a network. In: Proceeding, KDD ‘16 Proceedings of the 22nd ACM SIGKDD international conference on knowledge, discovery and data mining, pp. 1085–1094, ACM New York, NY, USA

  43. Bhatia N, Vandana SSCS (2010) Survey of nearest neighbor techniques. Int J Comput Sci Inf Secur 8:302–305

    Google Scholar 

  44. Desrosiers C, Karypis G (2010) A comprehensive survey of neighborhood-based recommendation methods. In: Ricci F, Rokach L, Shapira B, Kantor PB (eds) Recommender systems handbook. Springer, Boston, pp 107–144. doi:10.1007/978-0-387-85820-3_4

  45. Hadley C, Jones DT (1999) A systematic comparison of protein structure classifications: SCOP, CATH and FSSP. Structure 7(9):1099–1112

    Article  Google Scholar 

  46. Murzin AG, Brenner SE, Hubbard T, Chothia C (1995) SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol 247:536–540

    Google Scholar 

  47. Hore S, Chatterjee S, Sarkar S, Dey N, Ashour AS, Balas-Timar D, Balas VE (2016) Neural-based prediction of structural failure of multistoried RC buildings. Struct Eng Mech 58(3):459–473

    Article  Google Scholar 

  48. Zhang J, Chau KW (2009) Multilayer ensemble pruning via novel multi-sub-swarm particle swarm optimization. J UCS 15(4):840–858

    Google Scholar 

  49. Sharma K, Virmani J (2017) A decision support system for classification of normal and medical renal disease using ultrasound images: a decision support system for medical renal diseases. Int J Ambient Comput Intell 8(2):52–69

    Article  Google Scholar 

  50. Wang WC, Chau KW, Xu DM, Chen XY (2015) Improving forecasting accuracy of annual runoff time series using ARIMA based on EEMD decomposition. Water Resour Manag 29(8):2655–2675

    Article  Google Scholar 

  51. Li Z, Shi K, Dey N, Ashour AS, Wang D, Balas VE et al (2017) Rule-based back propagation neural networks for various precision rough set presented KANSEI knowledge prediction: a case study on shoe product form features extraction. Neural Comput Appl 28(3):613–630

    Article  Google Scholar 

  52. Manogaran G, Lopez D (2017) Disease surveillance system for big climate data processing and dengue transmission. Int J Ambient Comput Intell 8(2):88–105

    Article  Google Scholar 

  53. Zhang S, Chau KW (2009) Dimension reduction using semi-supervised locally linear embedding for plant leaf classification. In: International conference on intelligent computing. Springer, Berlin, pp 948–955. doi:10.1007/978-3-642-04070-2_100

  54. Wu CL, Chau KW, Li YS (2009) Methods to improve neural network performance in daily flows prediction. J Hydrol 372(1):80–93

    Article  Google Scholar 

  55. Chau KW, Wu CL (2010) A hybrid model coupled with singular spectrum analysis for daily rainfall prediction. J Hydroinform 12(4):458–473

    Article  Google Scholar 

  56. Wang XZ, He YL, Dong LC, Zhao HY (2011) Particle swarm optimization for determining fuzzy measures from data. Inf Sci 181(19):4230–4252

    Article  MATH  Google Scholar 

  57. Wang XZ, Xing HJ, Li Y, Hua Q, Dong CR, Pedrycz W (2015) A study on relationship between generalization abilities and fuzziness of base classifiers in ensemble learning. IEEE Trans Fuzzy Syst 23(5):1638–1654

    Article  Google Scholar 

  58. Nimmy SF, Kamal MS (2015) Next generation sequencing under De-Novo genome assembly. Int Journal of Biomath 8(5):1–29

    Article  MATH  Google Scholar 

  59. Kamal MS, Khan MI (2014) performance evaluation of Warshall algorithm and dynamic programming for markov chain in local sequence alignment. Interdiscip Sci Comput Life Sci 7(1):78–81

    Google Scholar 

  60. Kamal MS, Khan MI (2014) An integrated algorithm for local sequence alignment. Netw Model Anal Health Inform Bioinforma 3:1–9. doi:10.1007/s13721-014-0068-8

    Article  Google Scholar 

  61. Chatterjee S, Hore S, Dey N, Chakraborty S, Ashour AS (2016) Dengue fever classification using gene expression data: a PSO based artificial neural network approach. In: 5th International conference on frontiers in intelligent computing: theory and applications, volume: Springer AISC

  62. Wang D, He T, Li Z, Cao L, Dey N, Ashour AS, Balas VE, McCauley P, Lin Y, Xu J, Shi F (2016) Image feature-based affective retrieval employing improved parameter and structure identification of adaptive neuro-fuzzy inference system. Neural Comput Appl. doi:10.1007/s00521-016-2512-4

    Article  Google Scholar 

  63. Tatusova T, DiCuccio M, Badretdin A, Chetvernin V, Nawrocki EP, Zaslavsky L, Lomsadze A, Pruitt KD, Borodovsky M (2016) O. J. NCBI prokaryotic genome annotation pipeline. Nucleic Acids Res 44(14):6614–6624

    Article  Google Scholar 

  64. Tateno Y, Miyazaki S, Ota M, Sugawara H, Gojobori T (2000) DNA Data Bank of Japan (DDBJ) in collaboration with mass sequencing teams. Nucleic Acids Res 28:24–26 (Updated article in this issue: Nucleic Acids Res. (2002), 30, 27–30)

    Article  Google Scholar 

  65. Benson DA, K-Mizrachi I, Lipman DJ, Ostell J, Rapp BA, Wheeler DL (2000) GenBank. Nucleic Acids Res 28:15–18

    Article  Google Scholar 

  66. Schmuker M, Schwarte F, Brück A, Proschak E, Tanrikulu Y, Givehchi A, Scheiffele K, Schneider G (2007) SOMMER: self-organising maps for education and research. J Mol Model 13:225–228

    Article  Google Scholar 

  67. Faigl J (2016) An application of self-organizing map for multirobot multigoal path planning with minmax objective. Comput Intell Neurosci 2016:2720630. doi:10.1155/2016/2720630

    Article  Google Scholar 

  68. Muñoz A, Muruzábal J (1998) Self-organizing maps for outlier detection. Neurocomputing 18(1):33–60. doi:10.1016/S0925-2312(97)00068-4

    Article  Google Scholar 

  69. Rini DP, Shamsuddin SM, Yuhaniz SS (2011) Particle swarm optimization: technique, system and challenges. Int J Comput Appl 14(1):19–27

    Google Scholar 

  70. Hu X, Shi Y, Eberhart R (2004) Recent advances in particle swarm. Evol Comput 1:90–97 (CEC2004)

    Google Scholar 

  71. Kohonen T (1995) Self-organizing maps. Springer, New York

    Book  MATH  Google Scholar 

  72. Bai Q (2010) Analysis of particle swarm optimization algorithm. Comput Inf Sci 3(1). doi:10.5539/cis.v3n1p180

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Amira S. Ashour.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Kamal, M.S., Sarowar, M.G., Dey, N. et al. Self-organizing mapping based swarm intelligence for secondary and tertiary proteins classification. Int. J. Mach. Learn. & Cyber. 10, 229–252 (2019). https://doi.org/10.1007/s13042-017-0710-8

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s13042-017-0710-8

Keywords

Navigation