Prediction of Domain-Domain Interactions Using Inductive Logic Programming from Multiple Genome Databases

  • Thanh Phuong Nguyen
  • Tu Bao Ho
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4265)


Protein domains are the building blocks of proteins, and their interactions are crucial to forming stable protein-protein interactions (PPI) and take part in many cellular processes and biochemical events. Prediction of protein domain-domain interactions (DDI) is an emerging problem in computational biology. Unlike earlier work on DDI prediction, which exploits only a single protein database, this paper introduces an integrative approach to DDI prediction that exploits multiple genome databases using inductive logic programming (ILP). The main contributions of this work to biomedical knowledge discovery are a newly generated database of more than 100,000 ground facts for twenty predicates on protein domains, and various DDI findings that are evaluated to be significant. Experimental results show that ILP is better suited to this learning problem than several other methods. In addition, many predictive rules associated with domain sites, conserved motifs, protein functions, and biological pathways were found.


Inductive Logic Programming Domain Pair Protein Feature Extract Background Knowledge PROSITE Database 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Thanh Phuong Nguyen (1)
  • Tu Bao Ho (1)
  1. School of Knowledge Science, Japan Advanced Institute of Science and Technology, Nomi, Ishikawa, Japan
