Abstract
Proteins play a significant role in animal and human health. Interactions among proteins are complex and numerous, and separating proteins is a challenging process in molecular biology. Computational tools help to simulate the analysis and to reduce the training data to a smaller testing set. In this work, large proteins are mapped using self-organizing maps (SOMs). Neural-network-based SOMs play a significant role in reducing the irregular shapes of protein interactions, and iterative checking enables all proteins to be organized. In the next stage, particle swarm optimization (PSO), a swarm intelligence technique, is applied to classify the protein families. Secondary (two-dimensional) and tertiary (three-dimensional) proteins are grouped. Two-dimensional proteins contain fewer hydrocarbons than three-dimensional proteins. For faster analysis, the angles of the proteins are taken into account. The SOM is compared with the Bounding Box approach. The experimental evaluations show that swarm intelligence achieves faster processing with lower memory consumption and time. Because PSO combines protein datasets using fuzzy values, similar proteins are grouped compactly. Bounding Box, on the other hand, uses crisp values and therefore needs more space to organize the whole data. Without SOMs, swarm intelligence also gives poor results because of excessive time consumption and storage requirements. Moreover, for almost all classification and clustering tools, the overall classification task becomes slow, space consuming, and less sensitive when the input datasets contain noise and irrelevant data. Thus, the proposed SOM-based PSO approach reduces processing time while efficiently classifying secondary and tertiary proteins.
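As an illustration of the SOM stage summarized above, the following is a minimal sketch of self-organizing map training in plain Python. It is not the authors' implementation: the abstract does not specify the feature encoding, grid size, or learning schedule, so the two-angle feature vectors, the 2 x 2 grid, and the linearly decaying learning rate and neighborhood width are all assumptions made for this example.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid node whose weight vector is closest to x (squared Euclidean)."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows=2, cols=2, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Random initial weights in [0, 1), one vector per grid node.
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # assumed linear decay
        sigma = sigma0 * (1.0 - frac) + 1e-3     # assumed shrinking neighborhood
        for x in rng.sample(data, len(data)):    # visit samples in random order
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                # Pull each node toward x, weighted by its grid distance to the BMU.
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


# Hypothetical feature vectors: two backbone angles per protein, scaled to [0, 1].
data = [
    [0.10, 0.15], [0.12, 0.10], [0.08, 0.12],   # one illustrative group
    [0.90, 0.85], [0.88, 0.92], [0.95, 0.90],   # another illustrative group
]
som = train_som(data)
assignments = [best_matching_unit(som, x) for x in data]
```

Each sample ends up assigned to one grid node, which is the reduced representation a later classification stage such as PSO could operate on. The grid coordinates carry no meaning on their own; only the grouping of samples onto nodes matters.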
About this article
Cite this article
Kamal, M.S., Sarowar, M.G., Dey, N. et al. Self-organizing mapping based swarm intelligence for secondary and tertiary proteins classification. Int. J. Mach. Learn. & Cyber. 10, 229–252 (2019). https://doi.org/10.1007/s13042-017-0710-8
DOI: https://doi.org/10.1007/s13042-017-0710-8