Protein Data Condensation for Effective Quaternary Structure Classification
Many proteins are composed of two or more subunits, each corresponding to a distinct polypeptide chain. The number and arrangement of the subunits forming a protein are referred to as its quaternary structure. The quaternary structure is important because it characterizes the biological function of the protein when it is involved in specific biological processes. Unfortunately, quaternary structures cannot be trivially deduced from protein amino acid sequences. In this work, we propose a protein quaternary structure classification method that exploits the functional domain composition of proteins. It is based on a nearest neighbor condensation technique, which reduces both the portion of the dataset to be stored and the number of comparisons to be carried out. Our approach appears promising in that it achieves high classification accuracy even though it does not require the entire dataset to be analyzed. Indeed, experimental evaluations show that the proposed method selects a small portion of the dataset for classification (about 6.43%) and that it is very accurate (97.74%).
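To illustrate the idea of nearest neighbor condensation, the following minimal sketch uses the classic condensed nearest neighbor rule as a stand-in; it is not necessarily the condensation algorithm used in this work. It assumes that each protein is encoded as a binary vector marking which functional domains it contains and that proteins are compared with the Hamming distance. The toy data, the class labels, and the function names are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Vector = Tuple[int, ...]          # binary domain-composition vector (assumed encoding)
Sample = Tuple[Vector, str]       # (vector, quaternary structure label)


def hamming(a: Vector, b: Vector) -> int:
    """Number of positions at which two domain vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_label(query: Vector, subset: List[Sample]) -> str:
    """Label of the stored sample closest to the query (1-NN rule)."""
    return min(subset, key=lambda sample: hamming(query, sample[0]))[1]


def condense(data: List[Sample]) -> List[Sample]:
    """Classic condensed nearest neighbor rule (illustrative stand-in).

    Starts from one sample and keeps adding every sample that the current
    subset misclassifies, until a full pass adds nothing.
    """
    subset = [data[0]]
    changed = True
    while changed:
        changed = False
        for sample in data:
            if sample in subset:
                continue  # already stored; also guards against conflicting duplicates
            vector, label = sample
            if nearest_label(vector, subset) != label:
                subset.append(sample)
                changed = True
    return subset


# Hypothetical toy dataset: four domains, two quaternary structure classes.
TOY_DATA: List[Sample] = [
    ((1, 0, 0, 0), "monomer"),
    ((1, 1, 0, 0), "monomer"),
    ((1, 0, 1, 0), "monomer"),
    ((0, 0, 1, 1), "homodimer"),
    ((0, 1, 1, 1), "homodimer"),
    ((0, 0, 0, 1), "homodimer"),
]

if __name__ == "__main__":
    condensed = condense(TOY_DATA)
    print(f"kept {len(condensed)} of {len(TOY_DATA)} samples")
```

A new protein would then be classified with `nearest_label(vector, condensed)`, so only the condensed subset has to be stored and compared against, which is the source of the savings described above.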