The Maximum Common Substructure (MCS) Search as a New Tool for SAR and QSAR

Golbamaki, Azadi; Franchi, Alessio Mauro; Gini, Giuseppina

doi:10.1007/978-3-319-56850-8_5

Part of the book series: Challenges and Advances in Computational Chemistry and Physics (COCH, volume 24)

Abstract

The Maximum Common Substructure (MCS) between two molecules induces a similarity measure that makes it possible to group compounds sharing the same pattern. In our study, a similarity measure based exclusively on the MCS has been implemented, and its relevance examined, in new software built on the fmcs_R package. The program searches for the largest common substructures between a target molecule, whose property value is unknown, and a set of similar molecules with experimental values, in order to assess the toxicity of the target chemical. In QSAR and read-across, when reasoning about the similarity of the evaluated molecules, another important aspect is the difference between two molecules that share a large common part. The present study therefore examines both the MCS itself and the differences between a reference molecule and a similar molecule, with the aid of software developed ad hoc. Its most important features are (I) the computation of the MCS between two molecules represented as graphs and (II) the detection and graphical representation of the dissimilar substructures identified in the target and source molecules. The user can then quantify the properties and weights of these substructures to improve the assessment of new substances. The software is integrated into ToxRead, a system that visualizes structures and substructures for expert reasoning. In addition, an automatic search in a database that records the role of small substructures in amplifying or reducing the property can help improve the final assessment.
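The two features named in the abstract, finding the MCS of two molecular graphs and listing the atoms that fall outside it, can be illustrated with a small sketch. This is not the chapter's software and does not use the fmcs_R package: it is a brute-force search in plain Python, suitable only for very small molecules. The graph encoding (heavy atoms only, bond orders ignored), the function names, and the atom-count Tanimoto score are all assumptions made for illustration.

```python
from itertools import product


def adjacency(bonds, n_atoms):
    """Build a neighbor table from a list of (i, j) bond index pairs."""
    adj = {i: set() for i in range(n_atoms)}
    for i, j in bonds:
        adj[i].add(j)
        adj[j].add(i)
    return adj


def max_common_substructure(atoms_a, bonds_a, atoms_b, bonds_b):
    """Return a largest connected common induced subgraph as a dict
    {atom index in A: atom index in B}. Exhaustive search: toy sizes only.
    Assumption: atoms match on element symbol; bond orders are ignored."""
    adj_a = adjacency(bonds_a, len(atoms_a))
    adj_b = adjacency(bonds_b, len(atoms_b))
    best = {}
    seen = set()  # partial mappings already explored

    def compatible(a, b, mapping):
        # Same element, and bonded to mapped atoms in A exactly where
        # the counterpart is bonded in B (induced-subgraph condition).
        if atoms_a[a] != atoms_b[b]:
            return False
        return all((x in adj_a[a]) == (y in adj_b[b])
                   for x, y in mapping.items())

    def grow(mapping):
        nonlocal best
        key = frozenset(mapping.items())
        if key in seen:
            return
        seen.add(key)
        if len(mapping) > len(best):
            best = dict(mapping)
        used_b = set(mapping.values())
        # Only extend through neighbors, so the match stays connected.
        frontier = {n for a in mapping for n in adj_a[a] if n not in mapping}
        for a in frontier:
            for b in range(len(atoms_b)):
                if b not in used_b and compatible(a, b, mapping):
                    mapping[a] = b
                    grow(mapping)
                    del mapping[a]

    for a, b in product(range(len(atoms_a)), range(len(atoms_b))):
        if atoms_a[a] == atoms_b[b]:
            grow({a: b})
    return best


def mcs_report(atoms_a, bonds_a, atoms_b, bonds_b):
    """Return (mapping, atoms only in A, atoms only in B, similarity).
    The similarity is an atom-count Tanimoto score, chosen here as an
    assumption; the chapter may define its measure differently."""
    mapping = max_common_substructure(atoms_a, bonds_a, atoms_b, bonds_b)
    only_a = sorted(set(range(len(atoms_a))) - set(mapping))
    only_b = sorted(set(range(len(atoms_b))) - set(mapping.values()))
    shared = len(mapping)
    similarity = shared / (len(atoms_a) + len(atoms_b) - shared)
    return mapping, only_a, only_b, similarity


# Hypothetical example molecules as heavy-atom graphs.
RING = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
TOLUENE = (["C"] * 7, RING + [(0, 6)])          # atom 6 is the methyl carbon
PHENOL = (["C"] * 6 + ["O"], RING + [(0, 6)])   # atom 6 is the hydroxyl oxygen

if __name__ == "__main__":
    mapping, only_a, only_b, sim = mcs_report(*TOLUENE, *PHENOL)
    print("shared atoms:", len(mapping))
    print("only in toluene:", [(i, TOLUENE[0][i]) for i in only_a])
    print("only in phenol:", [(i, PHENOL[0][i]) for i in only_b])
    print("similarity:", sim)
```

For this pair the shared part is the six-membered carbon ring, and the leftover atoms (the methyl carbon and the hydroxyl oxygen) are the kind of dissimilar fragments a user would then weigh when judging a read-across. The search is exponential in molecule size, so a production tool would rely on a dedicated MCS algorithm instead.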

Acknowledgements

This research was supported by the PROSIL project (LIFE12 ENV/IT/000154). We thank Serena Manganelli and Giuseppa Raitano of the IRCCS Istituto di Ricerche Farmacologiche Mario Negri, who provided insight and expertise that greatly assisted the research.

Author information

Authors and Affiliations

Istituto di Ricerche Farmacologiche “Mario Negri” Milano, Milan, Italy
Azadi Golbamaki
DEIB, Politecnico di Milano, Milan, Italy
Alessio Mauro Franchi & Giuseppina Gini

Corresponding author

Correspondence to Azadi Golbamaki.

Editor information

Editors and Affiliations

Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India
Kunal Roy

About this chapter

Cite this chapter

Golbamaki, A., Franchi, A.M., Gini, G. (2017). The Maximum Common Substructure (MCS) Search as a New Tool for SAR and QSAR. In: Roy, K. (eds) Advances in QSAR Modeling. Challenges and Advances in Computational Chemistry and Physics, vol 24. Springer, Cham. https://doi.org/10.1007/978-3-319-56850-8_5

DOI: https://doi.org/10.1007/978-3-319-56850-8_5
Published: 25 May 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56849-2
Online ISBN: 978-3-319-56850-8
eBook Packages: Chemistry and Materials Science; Chemistry and Material Science (R0)
