Granular Approach for Protein Sequence Analysis

  • Ying Xie
  • Jonathan Fisher
  • Vijay V. Raghavan
  • Tom Johnsten
  • Can Akkoc
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7413)


Granular computing uses granules as basic units to compute with. Granules can be formed by either information abstraction or information decomposition. In this paper, we view information decomposition as a paradigm for processing data with complex structures. More specifically, we apply lossless information decomposition to protein sequence analysis. By decomposing a protein sequence into a set of proper granules and applying dynamic programming to align the position sequences of two corresponding granules, we are able to distribute the calculation of pairwise similarity of protein sequences to multiple parallel processes, each of which is less time consuming than the calculation based on an alignment of original sequences.
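The abstract does not spell out the granulation or the scoring scheme, so the following is only a minimal sketch under stated assumptions: each granule is taken to be the list of positions at which one amino acid occurs, and two corresponding position sequences are aligned by a global dynamic program. The function names (`decompose`, `reconstruct`, `align_cost`), the gap penalty, and the absolute-difference cost are hypothetical illustrations, not the authors' method.

```python
from collections import defaultdict


def decompose(seq):
    """Split a sequence into granules: residue -> ascending list of positions.

    Assumption: one granule per distinct residue (not confirmed by the source).
    """
    granules = defaultdict(list)
    for pos, residue in enumerate(seq):
        granules[residue].append(pos)
    return dict(granules)


def reconstruct(granules):
    """Rebuild the original sequence, showing the decomposition is lossless."""
    length = sum(len(positions) for positions in granules.values())
    out = [None] * length
    for residue, positions in granules.items():
        for pos in positions:
            out[pos] = residue
    return "".join(out)


def align_cost(a, b, gap=5):
    """Global DP alignment cost of two position sequences.

    Hypothetical scoring: pairing positions p and q costs |p - q|,
    leaving a position unpaired costs `gap`. Lower means more similar.
    """
    rows, cols = len(a) + 1, len(b) + 1
    dp = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dp[i][0] = i * gap
    for j in range(1, cols):
        dp[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            dp[i][j] = min(
                dp[i - 1][j - 1] + abs(a[i - 1] - b[j - 1]),  # pair positions
                dp[i - 1][j] + gap,                           # skip a[i-1]
                dp[i][j - 1] + gap,                           # skip b[j-1]
            )
    return dp[-1][-1]
```

Because each residue's granule pair is aligned independently of the others, the per-residue calls to `align_cost` could be handed to separate worker processes, and each works on a position list no longer than the original sequence.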


Keywords: Position Series, Position Sequence, Pairwise Similarity, Information Granulation, Protein Sequence Analysis





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Ying Xie (1)
  • Jonathan Fisher (1)
  • Vijay V. Raghavan (2)
  • Tom Johnsten (3)
  • Can Akkoc (4)

  1. Department of Computer Science, Kennesaw State University, Georgia, USA
  2. Center for Advanced Computer Studies, University of Louisiana at Lafayette, Louisiana, USA
  3. School of Computer and Information Sciences, University of South Alabama, Alabama, USA
  4. Institute of Applied Mathematics, The Middle East Technical University, Ankara, Turkey
