Abstract
The Maximum Common Substructure (MCS) between two molecules induces a similarity measure that makes it possible to group compounds sharing the same structural pattern. In this study, a similarity measure based exclusively on the MCS has been implemented in new software built on the fmcs_R package. The program searches for the largest substructures shared by a target molecule, whose property value is unknown, and a set of similar molecules with experimental values, in order to assess the toxicity of the target chemical. In QSAR and read-across, reasoning on the similarity of the evaluated molecules is not enough: an equally important aspect is how two molecules that share a large common part differ. The present study therefore examines both the MCS itself and the differences between a reference molecule and a similar molecule, with the aid of software developed ad hoc. The most important features of this software are: (I) the computation of the MCSs between two molecules represented as graphs and (II) the detection and graphical representation of the dissimilar substructures identified in the target and source molecules. The user can then quantify the properties and weights of these substructures to improve the assessment of new substances. The software is integrated into ToxRead, a system for visualizing structures and substructures for expert reasoning. Moreover, an automatic search in a database describing the role of small substructures in amplifying or reducing the property can help improve the final assessment.
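The two features listed above, MCS search on molecular graphs and detection of the dissimilar parts, can be illustrated with a minimal sketch. It is not the chapter's software and does not use fmcs_R: it is a brute-force search for a connected, induced common subgraph, practical only for very small molecules. The helper names (`make_mol`, `mcs`, `unmatched`, `mcs_tanimoto`), the heavy-atom graphs for phenol and aniline, the use of 1.5 as an aromatic bond label, and the Tanimoto-style size ratio are all assumptions made for illustration.

```python
from itertools import product

def make_mol(atoms, bonds):
    """Labeled graph: list of element symbols plus a symmetric bond table."""
    table = {}
    for i, j, order in bonds:
        table[(i, j)] = table[(j, i)] = order
    return {"atoms": atoms, "bonds": table}

def neighbors(mol, i):
    """Indices of atoms bonded to atom i."""
    return {j for (a, j) in mol["bonds"] if a == i}

def compatible(m1, m2, mapping, a, b):
    """Pair (a, b) can extend the mapping if the element symbols match and
    every bond (or absence of a bond) to already mapped atoms agrees."""
    if m1["atoms"][a] != m2["atoms"][b]:
        return False
    return all(m1["bonds"].get((a, x)) == m2["bonds"].get((b, y))
               for x, y in mapping.items())

def mcs(m1, m2):
    """Exhaustive search for a largest connected common induced subgraph.
    Returns a dict mapping atom indices of m1 to atom indices of m2."""
    best, seen = {}, set()

    def grow(mapping):
        nonlocal best
        key = frozenset(mapping.items())
        if key in seen:          # same partial mapping reached in another order
            return
        seen.add(key)
        if len(mapping) > len(best):
            best = dict(mapping)
        # Only atoms adjacent to the mapped part may be added (keeps it connected).
        frontier = {n for a in mapping for n in neighbors(m1, a)} - set(mapping)
        used = set(mapping.values())
        for a in frontier:
            for b in range(len(m2["atoms"])):
                if b not in used and compatible(m1, m2, mapping, a, b):
                    mapping[a] = b
                    grow(mapping)
                    del mapping[a]

    for a, b in product(range(len(m1["atoms"])), range(len(m2["atoms"]))):
        if m1["atoms"][a] == m2["atoms"][b]:
            grow({a: b})
    return best

def unmatched(mol, matched):
    """Atoms left outside the MCS: the 'dissimilar' part of a molecule."""
    return [(i, s) for i, s in enumerate(mol["atoms"]) if i not in matched]

def mcs_tanimoto(m1, m2, mapping):
    """Assumed similarity: |MCS| / (|m1| + |m2| - |MCS|), counted in atoms."""
    c = len(mapping)
    return c / (len(m1["atoms"]) + len(m2["atoms"]) - c)

# Heavy-atom graphs (hypothetical encodings): aromatic ring plus one substituent.
ring = [(i, (i + 1) % 6, 1.5) for i in range(6)]
phenol = make_mol(["C"] * 6 + ["O"], ring + [(0, 6, 1)])   # target
aniline = make_mol(["C"] * 6 + ["N"], ring + [(0, 6, 1)])  # source

mapping = mcs(phenol, aniline)
print(len(mapping))                                   # 6 (the shared ring)
print(unmatched(phenol, set(mapping)))                # [(6, 'O')]
print(unmatched(aniline, set(mapping.values())))      # [(6, 'N')]
print(mcs_tanimoto(phenol, aniline, mapping))         # 0.75
```

In this toy case the shared benzene ring is the MCS, and the hydroxyl oxygen and the amine nitrogen are the differing parts whose contribution to the property an expert would then have to weigh. MCS search is computationally hard in general, so real tools rely on much more efficient algorithms and heuristics than this exhaustive enumeration.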
Acknowledgements
This research was supported by the PROSIL project (LIFE12 ENV/IT/000154). We thank Serena Manganelli and Giuseppa Raitano of the IRCCS Istituto di Ricerche Farmacologiche Mario Negri, who provided insight and expertise that greatly assisted the research.
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Golbamaki, A., Franchi, A.M., Gini, G. (2017). The Maximum Common Substructure (MCS) Search as a New Tool for SAR and QSAR. In: Roy, K. (ed.) Advances in QSAR Modeling. Challenges and Advances in Computational Chemistry and Physics, vol 24. Springer, Cham. https://doi.org/10.1007/978-3-319-56850-8_5
DOI: https://doi.org/10.1007/978-3-319-56850-8_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56849-2
Online ISBN: 978-3-319-56850-8
eBook Packages: Chemistry and Materials Science, Chemistry and Material Science (R0)