Interactive Discriminative Mining of Chemical Fragments

  • Nuno A. Fonseca
  • Max Pereira
  • Vítor Santos Costa
  • Rui Camacho
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6489)


Structural activity prediction is one of the most important tasks in chemoinformatics. The goal is to predict a property of interest given structural data on a set of small compounds or drugs. Ideally, systems that address this task should be not only accurate but also able to identify an interpretable discriminative structure that describes the structural elements most discriminant with respect to some target.

This paper presents the application of Inductive Logic Programming (ILP) in interactive software for discriminative mining of chemical fragments. In particular, it describes the coupling of an ILP system with molecular visualisation software that allows a chemist to graphically control the search for interesting patterns in chemical fragments. Furthermore, we show how structural information, such as rings and functional groups (carboxyls, amines, methyls, and esters), is integrated and exploited in the search.
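
As a rough illustration of what "discriminative" means here, the sketch below scores candidate fragments by how many active compounds they cover minus how many inactive ones. It is not the paper's system: the `Molecule` type, the toy compounds, the representation of a fragment as a set of functional-group labels, and the positives-minus-negatives score are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Molecule:
    """Hypothetical toy compound: a name, its structural labels, and its class."""
    name: str
    groups: frozenset
    active: bool


def covers(pattern, molecule):
    # A pattern covers a molecule if every label in the pattern occurs in it.
    return pattern <= molecule.groups


def score(pattern, molecules):
    # Assumed score: covered actives minus covered inactives.
    pos = sum(1 for m in molecules if m.active and covers(pattern, m))
    neg = sum(1 for m in molecules if not m.active and covers(pattern, m))
    return pos - neg


def best_pattern(candidates, molecules):
    # Return the candidate with the highest score (first one wins on ties).
    return max(candidates, key=lambda p: score(p, molecules))


# Made-up data, for illustration only.
MOLECULES = [
    Molecule("m1", frozenset({"ring", "carboxyl"}), True),
    Molecule("m2", frozenset({"ring", "amine"}), True),
    Molecule("m3", frozenset({"methyl", "ester"}), False),
    Molecule("m4", frozenset({"ring", "methyl"}), False),
    Molecule("m5", frozenset({"carboxyl", "ester"}), True),
]

CANDIDATES = [
    frozenset({"ring"}),
    frozenset({"carboxyl"}),
    frozenset({"methyl"}),
    frozenset({"ring", "carboxyl"}),
]

if __name__ == "__main__":
    for pattern in CANDIDATES:
        print(sorted(pattern), score(pattern, MOLECULES))
    print("best:", sorted(best_pattern(CANDIDATES, MOLECULES)))
```

In an interactive setting, a chemist could edit the candidate list or reject a proposed pattern and rerun the scoring; an ILP system would search a far richer space of relational patterns than the flat label sets used here.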


Keywords: Drug design, graphical mining, efficiency





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Nuno A. Fonseca (1)
  • Max Pereira (1, 2)
  • Vítor Santos Costa (3)
  • Rui Camacho (2)

  1. CRACS-INESC Porto LA, Universidade do Porto, Porto, Portugal
  2. LIAAD-INESC Porto LA & DEI-FEUP, Universidade do Porto, Porto, Portugal
  3. CRACS-INESC Porto LA & DCC-FCUP, Universidade do Porto, Porto, Portugal
