Supervised semi-automated data analysis software for gas chromatography / differential mobility spectrometry (GC/DMS) metabolomics applications

  • Daniel J. Peirano
  • Alberto Pasamontes
  • Cristina E. Davis
Software Applications

Abstract

Modern differential mobility spectrometers (DMS) produce complex, multi-dimensional data streams that allow for near-real-time or post-hoc chemical detection in a variety of applications. An active area of interest for this technology is metabolite monitoring for biological applications, and these data sets regularly have unique technical and data analysis end-user requirements. While there are initial publications on how investigators have individually processed and analyzed their DMS metabolomic data, there are no user-ready commercial or open source software packages that are easily used for this purpose. We have created custom software uniquely suited to analyze gas chromatography / differential mobility spectrometry (GC/DMS) data from biological sources. Here we explain the implementation of the software, describe the user features that are available, and provide an example of how this software functions using a previously published data set. The software is compatible with many commercial or home-made DMS systems. Because the software is versatile, it can also potentially be used for other similarly structured data sets, such as GC/GC and other ion mobility spectrometry (IMS) modalities.

Keywords

  • Differential mobility spectrometry (DMS)
  • Field asymmetric ion mobility spectrometry (FAIMS)
  • Principal component analysis (PCA)
  • Partial least squares regression (PLS)
  • Data analysis
  • Software

Notes

Acknowledgments

Partial funding for this study was provided by the National Science Foundation (NSF) grant #1255915 [CED]; the California Citrus Research Board grants #5100-143 and #1500-159 [CED]; The Hartwell Foundation [CED]; the United States Department of the Army grant W15P7T-12-C-A005 [CED]; and the National Institutes of Health (NIH) grants #1U01EB022003-01 and #UL1 TR000002 [CED]. Student support was partially provided by the US Department of Veterans Affairs Post-9/11 GI Bill [DJP] and the National Science Foundation grant #1343479 Veteran’s Research Supplement [DJP]. The contents of this manuscript are solely the responsibility of the authors and do not necessarily represent the official views of the funding agencies.

The authors would like to thank the members of their research group who beta tested the software and suggested improvements: Raquel Cumeras, Mitchell M. McCartney, Sierra Spitulski, and Yuriy Zrodnikov.

Supplementary material

12127_2016_200_MOESM1_ESM.png (345 kb)
Supplementary Material 1 Image of the software featuring a list of loaded samples in the Data tab and a visualization of the data for the sample labelled Mild_10, which was used to demonstrate the preprocessing steps in the article. (PNG 344 kb)
12127_2016_200_MOESM2_ESM.png (653 kb)
Supplementary Material 2 Image of the software demonstrating control over the visualization of the data from sample Mild_10 by modifications of the compensation voltage (CV Range), retention time (RT Range) and the viewable data intensity (Z Range). (PNG 652 kb)
12127_2016_200_MOESM3_ESM.png (541 kb)
Supplementary Material 3 Image of the software implementing preprocessing through Savitzky-Golay smoothing and asymmetric least squares (ALS) baseline removal. The numerical settings of both preprocessing methods can be assigned by the user, and the order in which they are applied can also be chosen. (PNG 541 kb)
12127_2016_200_MOESM4_ESM.png (469 kb)
Supplementary Material 4 Image of principal component analysis (PCA) executed by the software on the healthy data after outlier detection as shown in Fig. 5b and d. This visualization is generated through the “PCA” button in the interface and incorporates the current settings of the retention time and compensation voltage as well as the selected preprocessing applied to the data. The left side shows the scores plot for each sample in the analysis; each number is the associated sample number in the Data tab for easy reference. The right side contains a visualization of the loadings for each principal component. (PNG 468 kb)
12127_2016_200_MOESM5_ESM.png (647 kb)
Supplementary Material 5 Image of the list of samples in the Data tab of the software with assigned categories and corresponding classifications for each sample. The checkmark in the box in the column Used indicates if the sample is to be included in principal component analysis (PCA) and model building, while samples without a corresponding checkmark have been identified as outliers and will not be included in the analysis. (PNG 647 kb)
12127_2016_200_MOESM6_ESM.png (620 kb)
Supplementary Material 6 Image of the settings for model building as well as the training regimen used and the option to build a model that can be stored as a file for later application to other samples within the Model tab. (PNG 619 kb)
12127_2016_200_MOESM7_ESM.png (579 kb)
Supplementary Material 7 Image of the numerical evaluation of a model. In the case of multiple categories, multiple models are generated, with one model generated for each classification. A prediction method can be selected by the user to enforce a standard method of prediction evaluation, and a boxplot shows the result of the corresponding model for each sample, grouped by the true classification of the sample. In this figure, the model is based on the category Healthy/Sick and the boxplot shows the individual model for the classification Healthy, so that a model response of 1 indicates Healthy and a response of 0 indicates not Healthy (in this case, Sick). (PNG 579 kb)
12127_2016_200_MOESM8_ESM.png (492 kb)
Supplementary Material 8 Image of the scores plot and loadings from the first two latent variables (LV) in a multiway partial least squares (nPLS) model as shown in Fig. 6a and c. This visualization is generated through the nPLS button in the prediction tab, which can be seen in Supplemental Fig. 7. The numbers in the scores plot on the left indicate the corresponding number of each sample in the Data tab. The legend and coloration of the samples are based on the user-defined classifications within the selected category for a given model. As described in the paper and in the caption of Fig. 6, the scores and loadings are based on a full model of the data which incorporates all samples in the training. (PNG 492 kb)

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Daniel J. Peirano (1)
  • Alberto Pasamontes (1)
  • Cristina E. Davis (1)

  1. Mechanical and Aerospace Engineering, University of California, Davis, Davis, USA