Abstract
Predicting the sensory properties of compounds is challenging due to the subjective nature of the experimental measurements. This testing relies on a panel of human participants and is therefore also expensive and time-consuming. We describe the application of a state-of-the-art deep learning method, Alchemite™, to the imputation of sparse physicochemical and sensory data and compare the results with conventional quantitative structure–activity relationship methods and a multi-target graph convolutional neural network. The imputation model achieved a substantially higher accuracy of prediction, with improvements in R² between 0.26 and 0.45 over the next best method for each sensory property. We also demonstrate that robust uncertainty estimates generated by the imputation model enable the most accurate predictions to be identified and that imputation also more accurately predicts activity cliffs, where small changes in compound structure result in large changes in sensory properties. In combination, these results demonstrate that the use of imputation, based on data from less expensive, early experiments, enables better selection of compounds for more costly studies, saving experimental time and resources.
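The abstract reports accuracy as R² and states that uncertainty estimates allow the most accurate predictions to be identified. The following is a minimal sketch of that idea on synthetic data, not the Alchemite method or the study's data: the helper names `r_squared` and `most_confident`, the noise model, and the 25% cut-off are all assumptions made for illustration.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def most_confident(sigma, fraction):
    """Indices of the `fraction` of predictions with the smallest uncertainty."""
    sigma = np.asarray(sigma, dtype=float)
    n_keep = max(1, int(round(fraction * len(sigma))))
    return np.argsort(sigma)[:n_keep]


# Synthetic stand-in data (assumption: illustrative only, not from the study).
rng = np.random.default_rng(0)
y_true = rng.normal(size=500)
# Hypothetical per-prediction uncertainty reported by a model.
sigma = rng.uniform(0.1, 1.5, size=500)
# Assume the model is well calibrated: prediction error scales with sigma.
y_pred = y_true + rng.normal(scale=sigma)

r2_all = r_squared(y_true, y_pred)
keep = most_confident(sigma, 0.25)
r2_confident = r_squared(y_true[keep], y_pred[keep])

print(f"R2 on all predictions:        {r2_all:.2f}")
print(f"R2 on most confident quarter: {r2_confident:.2f}")
```

If the reported uncertainties track the true errors, as assumed here, the retained subset scores a higher R² than the full set. With poorly calibrated uncertainties the same filter would give little or no benefit.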
Data availability
We are unable to publish the full data set used in this study due to the proprietary nature and high cost of the data. Nevertheless, we believe that the first demonstration of the potential for deep learning imputation to make accurate predictions of such challenging in vivo properties warrants publication. To ensure reproducibility and enable comparisons with other methods, we have previously published benchmarking studies accompanied by public domain data sets including all predicted values (Whitehead et al. [33]; Irwin et al. [35]).
Code availability
The code used in this study is proprietary.
References
Kass M, Rosenthal M, Pottackal J, McGann J (2013) Fear learning enhances neural responses to threat-predictive sensory stimuli. Science 342:1389–1392
Block E (2018) Molecular basis of mammalian odor discrimination: a status report. J Agric Food Chem 66:13346–13366
McGann J (2017) Poor human olfaction is a nineteenth century myth. Science 356:7263
Genva M, Kemene T, Deleu M, Lins L, Fauconnier M (2019) Is it possible to predict the odor of a molecule on the basis of its structure? Int J Mol Sci 20:3018
Buck L (2000) The molecular architecture of odor and pheromone sensing in mammals. Cell 100:611–618
Nara K, Saraiva L, Ye X, Buck L (2011) A large-scale analysis of odor coding in the olfactory epithelium. J Neurosci 31:9179–9191
Araneda R, Kini A, Firestein S (2000) The molecular receptive range of an odorant receptor. Nat Neurosci 3:1248–1255
Yeshurun Y, Sobel N (2010) An odor is not worth a thousand words: from multidimensional odors to unidimensional odor objects. Annu Rev Psychol 61:219–241
Zufall F, Leinders-Zufall T (2000) The cellular and molecular basis of odor adaptation. Chem Senses 25:473–481
Kraft P (2018) The odor value concept in the formal analysis of olfactory art. Helvetica 102:e1800185
Dunkel A, Steinhaus M, Kotthoff M, Nowak B, Krautwurst D, Schieberle P, Hofmann T (2014) Nature’s chemical signatures in human olfaction: a foodborne perspective for future biotechnology. Angew Chem Int Ed 53:7124–7143
Rossiter K (1996) Structure-odor relationships. Chem Rev 96:3201–3240
Kraft P, Bajgrowicz J, Denis C, Frater G (2000) Odds and trends: recent developments in the chemistry of odorants. Angew Chem Int Ed 39:2980–3010
Kraft P, Di Cristofaro V, Jordi A (2014) From cassyrane to cashmeran—the molecular parameters of odorants. Chem Biodiver 11:1567–1596
Zhan W, Doro F, Teixeira M (2019) A rapid approach to optimize the design of fragrances for fabric care products. Flavor Frag J 35:167–173
Trimmer C, Keller A, Murphy N, Snyder L, Willer J, Nagai M, Katsanis N, Vosshall L, Matsunami H, Mainland J (2019) Genetic variation across the human olfactory receptor repertoire alters odor perception. PNAS 116:9575–9580
Teixeira M, Barrault L, Rodriguez O, Carvalho C, Rodrigues A (2014) Perfumery radar 2.0: a step toward fragrance design and classification. Ind Eng Chem Res 53:8890–8912
Ruddigkeit L, Awale M, Reymond J (2014) Expanding the fragrance chemical space for virtual screening. J Cheminform 6:27
Medina-Franco J, Martinez-Mayorga K, Peppard T, Del Rio A (2012) Chemoinformatic analysis of GRAS (generally recognized as safe) flavor chemicals and natural products. PLoS ONE 7:e50798
Brenna E, Fuganti C, Serra S (2003) Enantioselective perception of chiral odorants. Tetrahedron Asymmetry 14:1–42
Schleyer P, Allinger N, Clark T, Gasteiger J, Kollman P, Schaefer H, Schreiner P (eds) (1998) Encyclopedia of computational chemistry. Wiley, Chichester
Breiman L (2001) Random forests. Mach Learn 45:5–32
Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press, Cambridge
Hunt P, Hosseini-Gerami L, Chrien T, Plante J, Ponting D, Segall M (2020) Predicting pKa using a combination of semi-empirical quantum mechanics and radial basis function methods. J Chem Inf Model 60:2989–2997
Obrezanova O, Csanyi G, Gola J, Segall M (2007) Gaussian processes: a method for automatic QSAR modelling of ADME properties. J Chem Inf Model 47:1847–1857
Sadawi N, Olier I, Vanschoren J, van Rijn R, Besnard J, Bickerton R, Grosan C, Soldatova L, King R (2019) Multi-task learning with a natural metric for quantitative structure activity relationship learning. J Cheminform 11:68
Feinberg E, Sur D, Wu Z, Husic B, Mai H, Li Y, Sun S, Yang J, Ramsundar B, Pande V (2018) PotentialNet for molecular property prediction. ACS Cent Sci 4:1520–1530
Nozaki Y, Nakamoto T (2018) Predictive modeling for odor character of a chemical using machine learning combined with natural language processing. PLoS ONE 13:e0198475
Gunaratne T, Gonzalez Viejo C, Gunaratne N, Torrico D, Dunshea F, Fuentes S (2019) Chocolate quality assessment based on chemical fingerprinting using near infra-red and machine learning modeling. Foods 8:426
Dagan-Wiener A, Nissim I, Ben Abu N, Borgonovo G, Bassoli A, Niv M (2017) Bitter or not? BitterPredict, a tool for predicting taste from chemical structure. Sci Rep 7:12074
Shang L, Liu C, Tomiura Y, Hayashi K (2017) Machine-learning-based olfactometer: prediction of odor perception from physicochemical features of odorant molecules. Anal Chem 89:11999–12005
Irwin B, Mahmoud S, Whitehead T, Conduit G, Segall M (2020) Imputation versus prediction: applications in machine learning for drug discovery. Future Drug Discov 2:38
Whitehead T, Irwin B, Hunt P, Segall M, Conduit G (2019) Imputation of assay bioactivity data using deep learning. J Chem Inf Model 59:1197–1204
Irwin B, Levell J, Whitehead T, Segall M, Conduit G (2020) Practical applications of deep learning to impute heterogeneous drug discovery data. J Chem Inf Model 60:2848–2857
Irwin B, Whitehead T, Rowland S, Mahmoud S, Conduit G, Segall M (2021) Deep imputation on large-scale drug discovery data. Appl AI Lett 2:e31
Segall M, Champness E (2015) The challenges of making decisions using uncertain data. J Comp-Aided Mol Des 29:809–816
Hirschfeld L, Swanson K, Yang K, Barzilay R, Coley C (2020) Uncertainty quantification using neural networks for molecular property prediction. J Chem Inf Model 60:3770–3780
Verpoort PC, MacDonald P, Conduit GJ (2018) Materials data validation and imputation with an artificial neural network. Comput Mater Sci 147:176–185
Bergstra J, Bardenet R, Bengio Y, Kégl B (2011) NIPS’11: proceedings of the 24th international conference on neural information processing. Red Hook, New York
Bergstra J, Komer B, Eliasmith C, Yamins D, Cox DD (2015) Hyperopt: a python library for model selection and hyperparameter optimization. Comput Sci Discov 8:014008
Optibrium Ltd. “StarDrop,” [Online]. https://www.optibrium.com/stardrop. Accessed 27 Sept 2021
Yang K, Swanson K, Jin W, Coley C, Eiden P, Gao H, Guzman-Perez A, Hopper T, Kelley B, Mathea M, Palmer A, Settels V, Jaakkola T, Jensen K, Barzilay R (2019) Analyzing learned molecular representations for property prediction. J Chem Inf Model 59:3370–3388
Green G, Dalton P, Cowart B, Shaffer G, Rankin K, Higgins J (1996) Evaluating the “labeled magnitude scale” for measuring sensations of taste and smell. Chem Senses 21:323–334
ASTM International (2019) ASTM E679-19, standard practice for determination of odor and taste thresholds by a forced-choice ascending concentration series method of limits. ASTM International, West Conshohocken
Funding
GC would like to acknowledge financial support from the Royal Society.
Author information
Contributions
All authors contributed to the study conception and design. Data curation was performed by DC; modelling and analysis of the results was performed by SM; the Alchemite method was developed by GC and TW. The first draft of the manuscript was written primarily by SM with contributions from all authors. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
SM, BI, TM and MS are employees of Optibrium Limited. DM, SV, JK and JB are employees of International Flavors & Fragrances, Inc. TW and GC are employees of Intellegens Limited.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Mahmoud, S., Irwin, B., Chekmarev, D. et al. Imputation of sensory properties using deep learning. J Comput Aided Mol Des 35, 1125–1140 (2021). https://doi.org/10.1007/s10822-021-00424-3
DOI: https://doi.org/10.1007/s10822-021-00424-3