A Machine Learning Application for Classification of Chemical Spectra
This paper presents a software package that allows chemists to analyze spectroscopy data using innovative machine learning (ML) techniques. The package is designed for use in conjunction with lab-based spectroscopic instruments and includes features intended to encourage its adoption by analytical chemists: an intuitive graphical user interface with a step-by-step ‘wizard’ for building new ML models; support for standard file types and data preprocessing; and both well-known standard chemometric analysis techniques and new ML techniques for the analysis of spectra, so that users can compare their performance. The ML techniques developed for this application were designed around the defining characteristics of this problem domain. They combine high accuracy with visualization, giving users some insight into the basis for classification decisions.
Keywords: Support Vector Machine, Artificial Neural Network, Partial Least Squares, Model Library, Kernel Support Vector Machine