Imputation of sensory properties using deep learning

Abstract

Predicting the sensory properties of compounds is challenging due to the subjective nature of the experimental measurements. This testing relies on a panel of human participants and is therefore also expensive and time-consuming. We describe the application of a state-of-the-art deep learning method, Alchemite™, to the imputation of sparse physicochemical and sensory data and compare the results with conventional quantitative structure–activity relationship methods and a multi-target graph convolutional neural network. The imputation model achieved a substantially higher accuracy of prediction, with improvements in R² of between 0.26 and 0.45 over the next best method for each sensory property. We also demonstrate that robust uncertainty estimates generated by the imputation model enable the most accurate predictions to be identified and that imputation also more accurately predicts activity cliffs, where small changes in compound structure result in large changes in sensory properties. In combination, these results demonstrate that the use of imputation, based on data from less expensive, early experiments, enables better selection of compounds for more costly studies, saving experimental time and resources.
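As a minimal sketch of how uncertainty estimates can be used to identify the most accurate predictions, the Python below ranks predictions by their reported standard deviation, keeps the most confident fraction, and compares the coefficient of determination before and after filtering. It is not the Alchemite method, whose code is proprietary. The function names, the `keep_fraction` parameter and all numeric values are assumptions made up for illustration and are not data from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def most_confident(y_true, y_pred, y_std, keep_fraction):
    """Keep the fraction of predictions with the smallest reported uncertainty."""
    # Indices ordered from most confident (smallest std) to least confident.
    order = sorted(range(len(y_std)), key=lambda i: y_std[i])
    # Always keep at least two points so that R^2 is defined.
    n_keep = max(2, int(round(keep_fraction * len(order))))
    kept = order[:n_keep]
    return [y_true[i] for i in kept], [y_pred[i] for i in kept]


# Toy values, invented purely for illustration (hypothetical, not study data).
measured = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
predicted = [1.1, 1.9, 3.2, 5.5, 3.0, 6.1]
std_dev = [0.1, 0.1, 0.2, 1.5, 2.0, 0.2]

r2_all = r_squared(measured, predicted)
kept_true, kept_pred = most_confident(measured, predicted, std_dev, keep_fraction=2 / 3)
r2_confident = r_squared(kept_true, kept_pred)

print(f"R^2 on all predictions:            {r2_all:.3f}")
print(f"R^2 on most confident predictions: {r2_confident:.3f}")
```

In this toy case the two predictions with the largest reported uncertainty are also the least accurate, so discarding them raises R². With real data the gain depends on how well the uncertainty estimates track the true errors.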

Data availability

We are unable to publish the full data set used in this study due to the proprietary nature and high cost of the data. Nevertheless, we believe that the first demonstration of the potential for deep learning imputation to make accurate predictions of such challenging in vivo properties warrants publication. To ensure reproducibility and enable comparisons with other methods, we have previously published benchmarking studies accompanied by public-domain data sets, including all predicted values (Whitehead et al. [33]; Irwin et al. [35]).

Code availability

The code used in this study is proprietary.

References

  1. Kass M, Rosenthal M, Pottackal J, McGann J (2013) Fear learning enhances neural responses to threat-predictive sensory stimuli. Science 342:1389–1392
  2. Block E (2018) Molecular basis of mammalian odor discrimination: a status report. J Agric Food Chem 66:13346–13366
  3. McGann J (2017) Poor human olfaction is a nineteenth century myth. Science 356:7263
  4. Genva M, Kemene T, Deleu M, Lins L, Fauconnier M (2019) Is it possible to predict the odor of a molecule on the basis of its structure? Int J Mol Sci 20:3018
  5. Buck L (2000) The molecular architecture of odor and pheromone sensing in mammals. Cell 100:611–618
  6. Nara K, Saraiva L, Ye X, Buck L (2011) A large-scale analysis of odor coding in the olfactory epithelium. J Neurosci 31:9179–9191
  7. Araneda R, Kini A, Firestein S (2000) The molecular receptive range of an odorant receptor. Nat Neurosci 3:1248–1255
  8. Yeshurun Y, Sobel N (2010) An odor is not worth a thousand words: from multidimensional odors to unidimensional odor objects. Annu Rev Psychol 61:219–241
  9. Zufall F, Leinders-Zufall T (2000) The cellular and molecular basis of odor adaptation. Chem Senses 25:473–481
  10. Kraft P (2018) The odor value concept in the formal analysis of olfactory art. Helvetica 102:e1800185
  11. Dunkel A, Steinhaus M, Kotthoff M, Nowak B, Krautwurst D, Schieberle P, Hofmann T (2014) Nature’s chemical signatures in human olfaction: a foodborne perspective for future biotechnology. Angew Chem Int Ed 53:7124–7143
  12. Rossiter K (1996) Structure-odor relationships. Chem Rev 96:3201–3240
  13. Kraft P, Bajgrowicz J, Denis C, Frater G (2000) Odds and trends: recent developments in the chemistry of odorants. Angew Chem Int Ed 39:2980–3010
  14. Kraft P, Di Cristofaro V, Jordi A (2014) From cassyrane to cashmeran – the molecular parameters of odorants. Chem Biodiver 11:1567–1596
  15. Zhan W, Doro F, Teixeira M (2019) A rapid approach to optimize the design of fragrances for fabric care products. Flavor Frag J 35:167–173
  16. Trimmer C, Keller A, Murphy N, Snyder L, Willer J, Nagai M, Katsanis N, Vosshall L, Matsunami H, Mainland J (2019) Genetic variation across the human olfactory receptor repertoire alters odor perception. PNAS 116:9575–9580
  17. Teixeira M, Barrault L, Rodriguez O, Carvalho C, Rodrigues A (2014) Perfumery radar 2.0: a step toward fragrance design and classification. Ind Eng Chem Res 53:8890–8912
  18. Ruddigkeit L, Awale M, Reymond J (2014) Expanding the fragrance chemical space for virtual screening. J Cheminform 6:27
  19. Medina-Franco J, Martinez-Mayorga K, Peppard T, Del Rio A (2012) Chemoinformatic analysis of GRAS (generally recognized as safe) flavor chemicals and natural products. PLoS ONE 7:e50798
  20. Brenna E, Fuganti C, Serra S (2003) Enantioselective perception of chiral odorants. Tetrahedron Asymmetry 14:1–42
  21. Schleyer P, Allinger N, Clark T, Gasteiger J, Kollman P, Schaefer H, Schreiner P (eds) (1998) Encyclopedia of computational chemistry. Wiley, Chichester
  22. Breiman L (2001) Random forests. Mach Learn 45:5–32
  23. Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press, Cambridge
  24. Hunt P, Hosseini-Gerami L, Chrien T, Plante J, Ponting D, Segall M (2020) Predicting pKa using a combination of semi-empirical quantum mechanics and radial basis function methods. J Chem Inf Model 60:2989–2997
  25. Obrezanova O, Csanyi G, Gola J, Segall M (2007) Gaussian processes: a method for automatic QSAR modelling of ADME properties. J Chem Inf Model 47:1847–1857
  26. Sadawi N, Olier I, Vanschoren J, van Rijn R, Besnard J, Bickerton R, Grosan C, Soldatova L, King R (2019) Multi-task learning with a natural metric for quantitative structure activity relationship learning. J Cheminform 11:68
  27. Feinberg E, Sur D, Wu Z, Husic B, Mai H, Li Y, Sun S, Yang J, Ramsundar B, Pande V (2018) PotentialNet for molecular property prediction. ACS Cent Sci 4:1520–1530
  28. Nozaki Y, Nakamoto T (2018) Predictive modeling for odor character of a chemical using machine learning combined with natural language processing. PLoS ONE 13:e0198475
  29. Gunaratne T, Gonzalez Viejo C, Gunaratne N, Torrico D, Dunshea F, Fuentes S (2019) Chocolate quality assessment based on chemical fingerprinting using near infra-red and machine learning modeling. Foods 8:426
  30. Dagan-Wiener A, Nissim I, Ben Abu N, Borgonovo G, Bassoli A, Niv M (2017) Bitter or not? BitterPredict, a tool for predicting taste from chemical structure. Sci Rep 7:12074
  31. Shang L, Liu C, Tomiura Y, Hayashi K (2017) Machine-learning-based olfactometer: prediction of odor perception from physicochemical features of odorant molecules. Anal Chem 89:11999–12005
  32. Irwin B, Mahmoud S, Whitehead T, Conduit G, Segall M (2020) Imputation versus prediction: applications in machine learning for drug discovery. Future Drug Discov 2:38
  33. Whitehead T, Irwin B, Hunt P, Segall M, Conduit G (2019) Imputation of assay bioactivity data using deep learning. J Chem Inf Model 59:1197–1204
  34. Irwin B, Levell J, Whitehead T, Segall M, Conduit G (2020) Practical applications of deep learning to impute heterogeneous drug discovery data. J Chem Inf Model 60:2848–2857
  35. Irwin B, Whitehead T, Rowland S, Mahmoud S, Conduit G, Segall M (2021) Deep imputation on large-scale drug discovery data. Appl AI Lett 2:e31
  36. Segall M, Champness E (2015) The challenges of making decisions using uncertain data. J Comp-Aided Mol Des 29:809–816
  37. Hirschfeld L, Swanson K, Yang K, Barzilay R, Coley C (2020) Uncertainty quantification using neural networks for molecular property prediction. J Chem Inf Model 60:3770–3780
  38. Verpoort PC, MacDonald P, Conduit GJ (2018) Materials data validation and imputation with an artificial neural network. Comput Mater Sci 147:176–185
  39. Bergstra J, Bardenet R, Bengio Y, Kégl B (2011) NIPS’11: proceedings of the 24th international conference on neural information processing. Red Hook, New York
  40. Bergstra J, Komer B, Eliasmith C, Yamins D, Cox DD (2015) Hyperopt: a python library for model selection and hyperparameter optimization. Comput Sci Discov 8:014008
  41. Optibrium Ltd. “StarDrop,” [Online]. https://www.optibrium.com/stardrop. Accessed 27 Sept 2021
  42. Yang K, Swanson K, Jin W, Coley C, Eiden P, Gao H, Guzman-Perez A, Hopper T, Kelley B, Mathea M, Palmer A, Settels V, Jaakkola T, Jensen K, Barzilay R (2019) Analyzing learned molecular representations for property prediction. J Chem Inf Model 59:3370–3388
  43. Green G, Dalton P, Cowart B, Shaffer G, Rankin K, Higgins J (1996) Evaluating the “labeled magnitude scale” for measuring sensations of taste and smell. Chem Senses 21:323–334
  44. ASTM International (2019) ASTM E679-19, standard practice for determination of odor and taste thresholds by a forced-choice ascending concentration series method of limits. ASTM International, West Conshohocken

Funding

GC would like to acknowledge financial support from the Royal Society.

Author information

Contributions

All authors contributed to the study conception and design. Data curation was performed by DC; modelling and analysis of the results were performed by SM; the Alchemite method was developed by GC and TW. The first draft of the manuscript was written primarily by SM with contributions from all authors. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Samar Mahmoud.

Ethics declarations

Conflict of interest

SM, BI, TM and MS are employees of Optibrium Limited. DC, SV, JK and JB are employees of International Flavors & Fragrances, Inc. TW and GC are employees of Intellegens Limited.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Mahmoud, S., Irwin, B., Chekmarev, D. et al. Imputation of sensory properties using deep learning. J Comput Aided Mol Des 35, 1125–1140 (2021). https://doi.org/10.1007/s10822-021-00424-3

  • DOI: https://doi.org/10.1007/s10822-021-00424-3

Keywords

  • Sensory properties
  • In silico model
  • Deep learning
  • Imputation
  • Quantitative structure–activity relationship