Introduction

The ultimate goal of metabolomics is the comprehensive study of the low molecular weight molecules within an organism. Metabolites are the result of both biological and environmental factors and, as such, provide great potential to bridge knowledge of genotype and phenotype. Metabolomics is often likened to its proteomics sibling and has leveraged proteomics experience, but the field has evolved with inherently different challenges including the identification process. Peptides and proteins are typically linear polymers and can be sequenced. Proteins are inferred by matching identified experimental peptides against in-silico fragmentation spectra. Metabolites are more challenging to annotate. These small molecules often lack a common building block, although there is common use of the elements C, H, O, N, S, P, and potentially heteroatoms. The idea that untargeted mass spectrometry (MS)-based metabolomics analysis will result in a large list of ‘identified’ small molecules that can be mapped to networks and pathways is often assumed, yet high confidence analyte assignments/identifications may not be made owing to the fundamental challenges of the metabolomic identification processes. For example, features (i.e., mass-to-charge ratio and retention time pairs) can be assigned to a vast number of tentative or preliminary structures, or there may be no candidate matches in curated databases. Because metabolites lack a genetic template such as that for proteins, metabolomics databases are currently considered incomplete. In-silico metabolite databases can provide guidance, but validation of retention times and MS/MS fragmentation data with a reference standard is nearly always required for confident metabolite identification.

Since its inception, the focus of the metabolomics field has shifted from detecting changes to understanding the biology leading to the changes [1], and thus the accuracy of metabolite assignments is extremely important. In this Critical Insight, we will discuss various challenges inherent to LC-MS-based metabolomics and describe the ranges of confidence for small molecule annotations when performing global metabolomic analyses, a concept essential for applying metabolomic data toward a better understanding of the mechanisms of human health and disease.

LC-MS-Based Metabolomics: Strengths and Challenges

Metabolomics experiments aim to characterize diverse classes of small molecules from a variety of sample types (e.g., cell extracts, culture media, urine, and serum). The metabolomics community has been influenced by numerous aspects of proteomics methodology such as separation technologies, state-of-the-art instrumentation, and data processing approaches. However, there are fundamental differences in MS-based metabolomics versus proteomics that are important to recognize. Table 1 outlines the strengths and challenges in contemporary metabolomics relative to proteomic analyses.

Table 1 Current Scale, Strengths, and Challenges of MS-Based Metabolomics and MS-Based Proteomics

The annotated human metabolome is considered to be less complex than the proteome [2, 3], yet the diverse chemical structures exhibit a wide range of concentration, solubility, polarity, and volatility [4]. Proteomics samples often require a multi-step preparation that may involve cell lysis, purification, enzymatic digestion, and solid phase extraction [5]. Sample preparation for metabolomics involves cell lysis and metabolite extraction [6], although purification and fractionation can also be performed. Metabolomics analyses are challenged by an analyte’s rapid temporal dynamics and sample composition reflecting endogenous and exogenous species (e.g., drugs, toxins, microorganisms, and nutrients) [1]. While proteomic analyses can often differentiate organism species based on protein sequence [6] (which is particularly useful in microbiome studies), species determination in metabolomics is challenging because many small molecules are common across different organisms [7]. This can, however, be advantageous for metabolomics animal model studies as knowledge of physical properties guiding identifications can be shared across species. Another major difference between proteomic and metabolomic technologies involves the interpretation of fragmentation data. Known protein sequences and enzyme cleavage patterns enable predictable peptide sequences and fragmentation spectra. Further, the large size of protein molecules often results in multiple peptides being observed, thus increasing confidence of protein identification. This is in contrast to metabolomics studies, where the small size and wide array of molecular structures of metabolites result in a singular species with no consensus fragmentation pattern.

Targeted and Untargeted Metabolomic Studies

Figure 1 outlines the goals and the types of data sets that are generated in targeted and untargeted/global metabolomic studies. In general, targeted approaches are aimed at identifying and quantifying a limited number (tens to hundreds) of known metabolites, such as those commonly encountered in clinical analyses. Many untargeted or hypothesis generating approaches focus on acquiring data for as many species as possible, annotating metabolites, and reviewing both known and unknown metabolic changes. Data can be used for relative quantification across sample groups and to provide hypotheses that can be further studied with targeted approaches. There are two broad approaches for data acquisition in untargeted metabolomics studies. The first method uses full scan MS1 to generate accurate mass measurements for individual molecules (i.e., features) to permit statistical calculations followed by data-dependent acquisition (DDA) of a subset of samples to guide identifications. Similar to conventional proteomics techniques, metabolomics DDA methods generate fragmentation patterns for metabolites exhibiting the highest signal intensity. A second untargeted metabolomics approach is based on data independent acquisition (DIA), where workflows integrate full MS1 with MS/MS fragmentation for all precursor ions either simultaneously (MSE [8]) or in finite mass ranges (SWATH [9]). DIA methods produce complicated fragmentation spectra and the link between precursor and product can be difficult to decipher. In downstream data analysis steps, fragment ions are matched with precursor ions based on retention time, mass, and drift time (when applicable). DIA allows fragmentation data to be acquired regardless of metabolite signal intensity. Both DDA and DIA approaches ultimately define features with mass-to-charge ratio (m/z), retention time (RT), and drift time (DT) descriptors. 
In the identification step, precursor ions and corresponding fragment ions are searched against databases for metabolite assignments.
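
The precursor-to-fragment association step used in DIA processing can be sketched as follows. This is a minimal illustration only, assuming that fragments are linked to a precursor when their retention time and drift time apexes agree within fixed tolerances; the `Ion` class, the `assign_fragments` function, and the tolerance values are hypothetical and do not correspond to any particular vendor or open source workflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ion:
    """A detected ion (hypothetical container for this sketch)."""
    mz: float  # mass-to-charge ratio
    rt: float  # retention time apex, minutes
    dt: float  # drift time apex, milliseconds


def assign_fragments(precursors, fragments, rt_tol=0.05, dt_tol=0.5):
    """Link each fragment to every precursor it co-elutes and co-drifts with.

    The tolerances are illustrative assumptions. A fragment may be linked to
    more than one precursor, which reflects the ambiguity of DIA spectra.
    """
    groups = {p: [] for p in precursors}
    for f in fragments:
        for p in precursors:
            same_rt = abs(f.rt - p.rt) <= rt_tol
            same_dt = abs(f.dt - p.dt) <= dt_tol
            # Fragments of singly charged precursors cannot exceed precursor m/z.
            if f.mz <= p.mz and same_rt and same_dt:
                groups[p].append(f)
    return groups
```

In this sketch, two precursors that co-elute in LC can still receive separate fragment lists when their drift times differ, which is the benefit of the added IM dimension.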

Figure 1

Untargeted versus targeted metabolomics studies. Untargeted or discovery-based metabolomics focuses on global detection and relative quantitation of small molecules in a sample. In contrast, targeted or validation-based metabolomics focuses on measuring well-defined groups of metabolites with opportunities for absolute quantitation

One major advantage of untargeted metabolomics is the collection of data without preexisting knowledge; however, this is accompanied by the caveat that certain sample preparation and analytical methods have a direct impact on the qualitative results that are obtained. Owing to the diverse composition of the metabolome [10], sample preparation steps, separation methods, and instrument platform and parameters will influence the subset of metabolites detected.

Analytical Platforms

A variety of separation [liquid chromatography (LC), gas chromatography (GC), and capillary electrophoresis (CE)] and detection [MS and nuclear magnetic resonance (NMR)] methods are used for metabolomics experiments. We focus on LC-MS-based metabolomics as it has become a leading technology for both polar and nonpolar small molecule analyses and draws many parallels with LC-MS-based proteomics analyses referenced herein. LC methods are time-consuming (minutes to hours) compared with direct infusion or flow injection analyses (seconds to minutes) [6]. However, the ability of LC to increase both selectivity and data content makes it invaluable [11], particularly for complex metabolomics samples such as human blood where an average of three isomers or isobars per nominal mass are estimated [12].

The coupling of ion mobility (IM) separations with LC-MS based analyses represents an emerging technology (LC-IM-MS) for metabolomics research. Ion mobility resolves gas phase ions based on their size-to-charge ratio or gas phase packing efficiency, complementing polarity and mass separations. The addition of ion mobility separation offers increased peak capacity [2], the ability to decrease chromatography time without sacrificing resolution, and opportunities to separate co-eluting precursors [12]. Rapid (milliseconds) IM separations are well integrated into time scales of most MS platforms; multiple IM spectra are acquired for each LC peak, and multiple mass spectra (microsecond time scale) are acquired for each IM spectrum [13]. In addition to improved mass spectra quality and increased selectivity, IM measurements can be used to determine collision cross sections (CCS) for individual metabolites. Unlike RT measurements that vary based on column chemistry, mobile phase, and elution gradient, CCS values are physical properties and not influenced by MS or LC settings. Inter-laboratory CCS precision is reported to be less than 5% for a broad range of molecules assayed [14]. Improvements to this precision are rapidly evolving with the development of more standardized protocols for CCS measurements.

Analysis and Identification

Untargeted metabolomics data processing workflows incorporate several defined steps including noise filtering, peak detection, peak deconvolution, retention time alignment, and finally feature annotation. Importantly, features are not always metabolites; related species (e.g., isotopes, neutral losses, adducts) of a single metabolite may be present with different m/z values. Metabolite identification is necessary to draw biological conclusions from untargeted metabolomics data. Analyte identification can be performed by searching the experimental MS1 or MS/MS data through databases available to the public for free (e.g., ChemSpider (http://www.chemspider.com), METLIN [15], Human Metabolome DataBase (HMDB) [16], MassBank [17], mzCloud (https://www.mzcloud.org), GNPS (http://gnps.ucsd.edu/), and LipidBlast [18]), or for a nominal fee [e.g., NIST Mass Spectral Library (http://chemdata.nist.gov)]. Batch searching MS/MS fragmentation spectra within these databases, however, is often not possible without commercial software. Given that numerous libraries are generally queried to maximize metabolome coverage, bioinformatics efforts are necessary to remove or reduce match redundancy. This process can be complicated since metabolite nomenclature is not entirely standardized and varies greatly by database.
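
To illustrate why several features can originate from one metabolite, the following sketch computes the m/z values expected for a single neutral monoisotopic mass under a few common adducts, and shows how assuming the wrong adduct shifts the inferred neutral mass. The adduct table is a small illustrative subset with commonly tabulated mass shifts, and the function names are hypothetical.

```python
# Mass shifts (Da) for a few singly charged adducts; an illustrative subset,
# not an exhaustive or authoritative adduct list.
ADDUCTS = {
    "[M+H]+": (1, 1.007276),
    "[M+NH4]+": (1, 18.033823),
    "[M+Na]+": (1, 22.989218),
    "[M-H]-": (1, -1.007276),
}


def adduct_mz(neutral_mass, adduct):
    """Expected m/z of a neutral monoisotopic mass observed as the given adduct."""
    charge, shift = ADDUCTS[adduct]
    return (neutral_mass + shift) / charge


def neutral_mass(mz, adduct):
    """Neutral monoisotopic mass inferred from an m/z under an assumed adduct."""
    charge, shift = ADDUCTS[adduct]
    return mz * charge - shift


# Glucose (C6H12O6), monoisotopic mass 180.06339 Da, used as an example.
glucose = 180.06339
expected = {name: adduct_mz(glucose, name) for name in ADDUCTS}

# Treating the sodium adduct as a protonated ion gives a wrong neutral mass.
wrong_mass = neutral_mass(expected["[M+Na]+"], "[M+H]+")
```

Here `wrong_mass` differs from the true neutral mass by the difference between the sodium and proton shifts (about 21.98 Da), so the feature would be searched against the wrong candidates.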

Feature annotation is performed by comparing an experimental mass measurement to a database of known metabolites within a mass tolerance window to generate potential candidates. Thus, the development of high-resolution, high mass accuracy instruments has proven to be invaluable for discovery (MS1) and heuristic validation (MS2) metabolomics efforts. As illustrated in Figure 2, it is difficult for MS mass measurement alone to provide metabolite information beyond molecular formula. Kind and Fiehn demonstrated that high mass accuracy measurements (<1 ppm error) were inadequate for determining the elemental composition of numerous metabolites [19]; notably, the authors later showed that isotope ratio measurements were more important than mass accuracy for determining the most probable elemental composition for small molecules [20]. Additional information, such as fragmentation data, is essential for structure elucidation of a mass measurement. Putative identifications require matching an experimental MS/MS spectrum with a reference fragmentation spectrum [21, 22]. Metabolomics spectral libraries have been created with experimental data from commercially available or synthesized standards. Significant efforts are being made to routinely update content as new compounds are analyzed; nevertheless, these libraries are considered incomplete [23]. MS/MS data is often insufficient to differentiate structural and stereo-isomers. Orthogonal evidence is needed in these cases and when experimental MS/MS data is non-discriminating. LC and IM can be used to generate retention time and collision cross section information, respectively. Both of these separation methods are capable of resolving some isomeric/isobaric species. IM has even shown utility for differentiating lipids based on position of double bond, which often cannot be accomplished by LC [24].
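
The mass tolerance search described above can be sketched in a few lines. This is a minimal example assuming a toy in-memory database of (name, monoisotopic mass) pairs; the function names and the tolerance values are illustrative, and real searches would query resources such as those listed earlier.

```python
def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def candidates_within(measured_mass, database, tol_ppm=5.0):
    """Return database entries whose mass lies within +/- tol_ppm of the measurement."""
    return [
        (name, mass)
        for name, mass in database
        if abs(ppm_error(measured_mass, mass)) <= tol_ppm
    ]


# Toy database: glucose and fructose are isomers (C6H12O6), while theophylline
# (C7H8N4O2) has a different formula with a nearby monoisotopic mass.
TOY_DB = [
    ("glucose", 180.06339),
    ("fructose", 180.06339),
    ("theophylline", 180.06473),
]

measured = 180.06340
tight = candidates_within(measured, TOY_DB, tol_ppm=5.0)
loose = candidates_within(measured, TOY_DB, tol_ppm=10.0)
```

With the tighter window, the theophylline entry (roughly 7 ppm away) is excluded, but the two isomers remain indistinguishable; this is the limit of mass measurement alone that Figure 2 illustrates.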
MS-based metabolomic studies are performed on numerous different instrument platforms; ion intensities and fragmentation patterns vary based on analytical conditions including instrument, ionization source, and collision energy [17]. For small molecules, a collision energy that depletes some precursors may have little effect on others. The selection of isobaric co-eluting precursor ions for fragmentation may further complicate experimental MS/MS data. MS/MS matching can be subjective. Scores are generated to represent similarities between experimental data from an unknown versus that from the standard, and assignments are often made using the best match. False positives and false negatives may be the result of low quality spectra and incomplete databases, respectively. Many opportunities exist for the development of methods to calculate these unknowns, as well as a confidence metric for scoring MS/MS matches [11].
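
As one concrete example of a spectral match score, the sketch below computes a cosine (normalized dot product) similarity between two binned MS/MS spectra. The text does not prescribe a specific scoring function; cosine similarity is used here only as a commonly encountered choice, and the bin width and function names are assumptions.

```python
import math


def bin_spectrum(peaks, bin_width=0.01):
    """Sum intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_score(query, reference, bin_width=0.01):
    """Cosine similarity between two spectra; 0 = no shared peaks, 1 = identical shape."""
    q = bin_spectrum(query, bin_width)
    r = bin_spectrum(reference, bin_width)
    dot = sum(q[k] * r[k] for k in q.keys() & r.keys())
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in r.values())
    )
    return dot / norm if norm else 0.0
```

Because the score depends on relative intensities, the same compound fragmented at a different collision energy or on a different instrument can score well below 1 against its own library entry, which is one reason the best match is not necessarily a correct match.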

Figure 2

An illustration of the amount of information density present at different levels of mass measurement accuracy, using the validated entries in the PubChem compound database. (a) The distribution of molecules in the PubChem compound database between 0 and 1000 Da, as surveyed in 2007, 2011, and 2015. As new compounds are discovered and archived, the distribution has shifted to lower mass, with most entries currently centered between 100 and 600 Da. Theoretical molecular formulas determined from chemical stability rules are illustrated by the dotted line, indicating that most of these entries are isomers. The inset zooms in on a 10 Da window where over half a million compounds are represented. (b) At increasing levels of mass accuracy, the number of possible molecular formulas can be reduced to a few thousand, but in one extreme case shown at 1 ppm, one formula is represented by over 10,000 isomers in the database. Mass spectrometry can significantly reduce complexity, but it cannot fully address molecular characterization without other dimensions of information. Reproduced with permission of Annual Review of Analytical Chemistry, Volume 9 by Annual Reviews, http://www.annualreviews.org from reference [2]

A subset of experimental metabolomics data does not match any database entry. Null matches may represent truly new metabolites or simply known metabolites that are missing from or do not match the spectral database (e.g., in-source fragments, metabolites modified by enzyme activity, etc.) [21]. Characterization of these “unknown” unknowns requires significant effort, such as that often encountered in natural product discovery of secondary metabolites [25, 26]. Algorithms geared toward predicting and comparing small molecule in silico and experimental MS/MS data are also currently available (e.g., MetFrag [22]); however, significant opportunities exist for the refinement and further development of these tools. The addition of IM data is informative; mobility-mass correlations as well as CCS/mass ratios can guide unknown identifications by giving an idea of molecular class and by excluding unlikely candidates on the basis of structure. Established metabolomics labs and metabolomics centers have fixed chromatography methods that are robust, reliable, and yield stable retention times. High quality RT and MS/MS fragmentation data of pure reference standards have been acquired for in-house libraries. These efforts certainly facilitate identification confidence, but are not feasible for most small research groups. Thus, leveraging methodologies and data with shared knowledge will benefit the entire metabolomics community.

Confidence Levels

Metabolite annotation is the crucial link between acquired data and meaningful biological information. It is essential that the confidence of metabolite assignments is transparent. In 2007, the Chemical Analysis Working Group (CAWG) of the Metabolomics Standards Initiative (MSI) published a first stage of guidelines for reporting the minimum metadata relative to metabolite identification as a means to communicate the confidence of identifications [27]. Recently, revisions to these levels have been proposed to cover special cases where level determination may be unclear [28, 29]. We propose modest changes to include orthogonal IM-MS data as evidence for metabolite identification (Figure 3).

Figure 3

Proposed workflow for metabolite identification confidence using multidimensional mass spectrometry. From top to bottom: obtaining an exact mass measurement for a Unique Feature (Level 5) allows database searching, which here is illustrated by the over 61 million compounds indexed in PubChem at the time of this review. Subsequent levels of mass accuracy reduce the number of possible molecular formulas from over 200,000 (unit resolution), to ca. 10,000 at 1 ppm mass accuracy for the example mass of 354 Da. Using higher mass accuracy and/or a heuristic filtering approach obtains a unique Molecular Formula (Level 4), which still represents several thousand isomeric compounds. Tentative Structures (Level 3) match precursor m/z to a metabolite database, and Putative Identifications (Level 2) match fragmentation data to metabolite MS/MS libraries. Obtaining a Validated Identification (Level 1) requires additional data evidence, such as tandem MS/MS, LC, IM, or measurements from other analytical techniques (optical spectroscopy or NMR) that match corresponding reference standard data under identical experimental conditions. Right portion of figure modified with permission of Annual Review of Analytical Chemistry, Volume 9 by Annual Reviews, http://www.annualreviews.org from reference [2]

The highest confidence identification, a validated identification (Level 1), confirms a structure with a minimum of two orthogonal properties from a pure reference standard acquired under identical analytical conditions. When no reference standard has been acquired but predictive or externally acquired structural evidence is available (namely MS/MS data exhibiting diagnostic fragments or neutral losses consistent with a specific structure), the assignment is considered a putative identification (Level 2). Preliminary identifications (Level 3) arise when accurate mass and isotopic distribution patterns produce tentative structures from database searches. Note that a single molecular formula typically renders multiple candidate structures. Molecular formula candidates (Level 4) and a deconvoluted experimental m/z (Level 5) complete the less confident annotation classifications.
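
The level definitions above can be expressed as a simple decision rule. The sketch below is a simplified reading of those definitions rather than an official scoring scheme; the `Evidence` fields and the `confidence_level` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Evidence collected for one feature (hypothetical structure for this sketch)."""
    molecular_formula: bool = False         # unique formula from mass and isotopes
    database_structure_match: bool = False  # tentative structure(s) from a database
    msms_match: bool = False                # library or predicted MS/MS agreement
    # Orthogonal properties (e.g., RT, MS/MS, CCS) matched to a pure reference
    # standard acquired under identical analytical conditions.
    standard_properties_matched: int = 0


def confidence_level(e):
    """Return the annotation level, from 1 (validated) to 5 (unique feature)."""
    if e.standard_properties_matched >= 2:
        return 1  # validated identification
    if e.msms_match:
        return 2  # putative identification
    if e.database_structure_match:
        return 3  # preliminary identification / tentative structure
    if e.molecular_formula:
        return 4  # molecular formula candidate
    return 5      # deconvoluted experimental m/z only
```

Reporting the returned level alongside each metabolite name makes the supporting evidence explicit to downstream users of the data.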

Suggestions have been made to clarify the set of reporting standards with the inclusion of an evidence-based quantitative score [30]. With either a score- or level-based system, the future of annotation is likely to be influenced by multiplexed technologies. Recently, Pacini et al. obtained five levels of small molecule data in a single DIA acquisition (LC, UV, IM, MS, MS/MS) [31]. Advances of multidimensional analytical approaches are inherently the most promising for the broadest metabolome coverage. Orthogonal in-line data can provide the needed evidence to meet minimum data requirements for confident identifications. At the present time, LC and IM are successfully multiplexed with MS providing RT and CCS data, respectively. As IM-MS becomes more widespread and CCS data is populated in searchable metabolomic databases and libraries, identifications using this knowledge will increase metabolite assignment confidence. For metabolite candidates that lack an available reference standard, a quantitative structure retention relationship (QSRR) model can predict retention times [32] and computational calculations can estimate CCS values [33]. The class-specific relationship in IM is also valuable evidence to support both annotations of metabolites and exclusion of unlikely candidates. For example, only correlated molecular classes based on IM trend lines or retention times based on polarity may be considered for identification purposes.

False Discovery Rate

False positive identifications are a significant challenge for metabolomics. As described above, annotations arise by querying neutral mass against a database of candidate small molecule masses. Neutral masses are inferred from experimental m/z, and thus the presence of related isotope and adduct features may complicate neutral mass determination and potentially lead to false positive identifications. False positives can also arise during MS/MS spectrum matching. Statistical tools for estimating the error of metabolite-spectrum matches are necessary for evaluating the confidence of annotation results. The inclusion of experimental orthogonal data such as RT and CCS data will decrease false positives, but there is currently no agreed upon metric to assess false discovery rate (FDR) of metabolite identifications. In MS-based proteomic studies, target-decoy search-based FDR calculations are widely accepted [34, 35]. Briefly, predicted peptide MS/MS spectra are used to create a reverse decoy database, and experimental data matches are used to estimate FDR. In principle, a target-decoy strategy could be utilized for metabolomics using a small molecule set exclusive of the experimental species, but since metabolomics databases are incomplete, this approach is currently challenging [23]. There are reports of novel FDR methods exclusive of decoy approaches, though none has yet gained widespread acceptance. For example, a simulation model that uses the rate of a match for elemental composition search queries [36], the incorporation of a spectrum similarity score with a completion score for GC × GC/TOF-MS data [37], and a mixture modeling method coined GREAZY for phospholipids [38] have all been suggested. 
Querying predicted molecular formula of experimental data against a decoy set of theoretically possible candidates has been proposed; however, the inflated search space increases the chance of a false positive identification and it is difficult to distinguish artificial compositions from legitimate candidate metabolites [36]. Potential inclusion of the Seven Golden Rules [20] may offer an approach to classify the legitimate candidate list for exclusion from the decoy elemental composition list.
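
For reference, the target-decoy estimate used in proteomics can be sketched as follows. This minimal example assumes that each match carries a score and a flag indicating whether it came from the decoy database, and that FDR is estimated as the ratio of decoy to target matches above a score threshold; how to construct a suitable decoy set for metabolites remains the open problem discussed above, and the function name is hypothetical.

```python
def fdr_at_threshold(matches, threshold):
    """Estimate FDR among matches scoring at or above the threshold.

    matches: iterable of (score, is_decoy) pairs.
    Uses the simple decoy/target ratio, one common convention.
    """
    targets = sum(1 for score, is_decoy in matches if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


# Illustrative scores only; not real data.
example_matches = [
    (0.95, False), (0.90, False), (0.85, False), (0.80, True),
    (0.75, False), (0.60, True), (0.50, False),
]
```

Lowering the score threshold admits more target matches but also more decoys, so the estimated FDR rises; a threshold is typically chosen to keep the estimate below a preset limit.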

Biological Analysis of Metabolites

Biological interpretation of metabolomics data and, ultimately, systems biology studies hinges on the ability to accurately identify metabolites so they can be mapped to pathways and networks. Data from an untargeted metabolomics experiment is challenging to visualize and interpret due to the high density of measurements. This challenge is amplified by the fact that numerous features are identified with varying levels of confidence. Table 2 outlines several open source options for analyzing metabolomics data depending on identification confidence level (unique feature to validated identification, described in Confidence Levels section above). Subsets of data may be analyzed using different tools based on the data obtained (e.g., MS/MS fragmentation spectra are often only available for higher abundance ions from DDA analyses). Statistical and multivariate analyses are applied to prioritize data; multiple hypothesis testing, data dimension reduction (e.g., principal component analysis (PCA) scores or loadings plots and partial least squares (PLS) modeling), and data visualization (e.g., cloud plots [39]) and clustering (e.g., self-organizing map (SOM) [40]) can reveal altered ion abundances and patterns that may be characteristic of the phenotype.

Table 2 Confidence Annotation, Statistical Evaluation, and Selected Bioinformatics Tools

Most of the existing analysis tools require a list of identified metabolites to integrate biological knowledge [42–45]. New techniques for placing small molecules in a biological context are now being presented, relying on the integration of systems biology tools. For example, genomic and metabolomic data have been combined in a mining workflow to identify pharmaceutical candidates [26]. Another innovative approach utilizes the fact that single upstream biological disruptions result in a cascade of metabolomic changes. The creation of informatic strategies, such as mummichog [41], which predict biological activity from MS1 data rather than formal MS2-dependent identifications, is an attractive concept as it circumvents identification challenges. Importantly, a high level of agreement between identifications from mummichog results and conventional identification pipelines is found. This software uses the accurate mass of m/z features to map candidate metabolites to genome-scale metabolic networks and calculates local enrichment of metabolites to distinguish those networks from a stochastic distribution of metabolites [41]. Modules are generated to reveal areas of network activity, as illustrated in Figure 4, which presents a comparison of metabolomic profiles of glucose 6-phosphate dehydrogenase deficient (G6PDd) and normal human erythrocytes. These data are then used to focus additional efforts on validating the prioritized metabolites from the multitude of possibilities, including isomeric species, from database searching.
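
The general idea of testing whether significant features concentrate in one network more than a random selection would can be sketched with a generic permutation test. This is not mummichog's implementation, whose statistics are not detailed here; the function name, the feature identifiers, and the number of permutations are assumptions made for illustration.

```python
import random


def enrichment_p_value(significant, all_features, pathway_features,
                       n_perm=2000, seed=0):
    """Permutation p-value for over-representation of a pathway.

    Compares the number of significant features mapping to the pathway with
    the counts obtained from random feature sets of the same size.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    pathway = set(pathway_features)
    observed = len(set(significant) & pathway)
    pool = list(all_features)
    k = len(set(significant))
    hits = 0
    for _ in range(n_perm):
        draw = rng.sample(pool, k)
        if len(pathway.intersection(draw)) >= observed:
            hits += 1
    # Add-one correction avoids reporting a p-value of exactly zero.
    return (hits + 1) / (n_perm + 1)
```

A small p-value indicates that the overlap between the significant features and the pathway is unlikely under random selection, which can be used to prioritize that pathway's candidate metabolites for validation.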

Figure 4

Network module output from mummichog analysis of the qualitative and relative quantitative differences in metabolomic profiles of G6PDd versus normal human erythrocytes. Feature m/z values and significance measurements were used to predict metabolic activity networks without the use of conventional MS/MS identification workflows. Metabolites are colored blue (negative fold change) or red (positive fold change) and the size/color intensity represents the magnitude of fold change

Summary

This is an exciting time for metabolomics research. Tremendous progress has been made in establishing the necessary foundation for the field to mature. The metabolomics community now has the opportunity to address the high-reward challenges associated with MS/MS data interpretation, database content, isomer resolution, identification confidence, and FDR estimation. Innovative research and development is essential, particularly at the interface of biomedical, cheminformatics, and bioinformatics fields. The metabolome is thought to be most predictive of phenotype; thus, novel ideas that address these challenges will allow the field to better understand mechanisms underlying health and disease.