A Brief Review of Data Mining Application Involving Protein Sequence Classification

  • Suprativ Saha
  • Rituparna Chaki
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 177)


Data mining techniques have been used by researchers to analyze protein sequences. In protein analysis, and especially in protein sequence classification, feature selection is the most important step. Popular protein sequence classification techniques involve extracting specific features from the sequences. Researchers apply well-known classification techniques such as neural networks, genetic algorithms, fuzzy ARTMAP, and rough set classifiers for accurate classification. This paper presents a review of three different classification models: the neural network model, the fuzzy ARTMAP model, and the rough set classifier model. A new technique for classifying protein sequences is proposed at the end. The proposed technique tries to reduce the computational overhead encountered by earlier approaches and to increase the accuracy of classification.


Keywords: Data Mining, Neural Network Model, Fuzzy ARTMAP Model, Rough Set Classifier





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Department of Computer Sc. and Engineering, West Bengal University of Technology, Kolkata, India
