Study design and subjects
The Karlsruhe Metabolomics and Nutrition (KarMeN) study is a cross-sectional study that was performed at the Max Rubner-Institut (MRI) in Karlsruhe, Germany, between 2011 and 2013 and was described in detail by Bub et al.
After medical examination, 301 healthy, normal-weight and overweight adults (129 females, 172 males) aged 18–80 years were included in the study. All subjects were non-smokers and took no medication known to influence energy metabolism or body composition. Exclusion criteria included the presence of acute or chronic illness and pregnancy or breastfeeding. Additionally, if a participant did not follow the study protocol, their results were excluded. In females, all measurements were performed during the luteal phase of the menstrual cycle to reduce cycle-related impact on metabolites and to avoid blood contaminating urine samples.
Following a 10-h overnight fast, body weight and height were measured according to standard operating procedures: weight was measured to the nearest 0.1 kg without shoes, in light clothing or underwear, and after voiding; height was measured to the nearest 0.5 cm with a stadiometer (seca, Hamburg, Germany).
REE was measured by indirect calorimetry (IC) using a ventilated hood system (Vmax Encore 29 n, SensorMedics BV, Bilthoven, The Netherlands). In accordance with the recommended best practice guidelines, the subjects lay in comfortable beds in a quiet, dedicated study room. Room temperature was 22–24 °C at constant humidity. The IC measurement started with an initial 10-min period to accustom the participant to the device and test conditions, followed by a 20-min recording period under strict resting conditions. Flow was calibrated with a 3-L syringe, and the gas analyzers were calibrated beforehand. Data were collected every 20 s, and the acquired volumes of oxygen (VO2) and carbon dioxide (VCO2) were converted to REE (kcal/24 h) using the equation of Weir: REE = (3.940 × VO2 + 1.106 × VCO2) × 1440 − 2.17 × N2 (VO2 and VCO2 were measured in ml/min and must be expressed in L/min for the factor 1440 to yield kcal/24 h; N2: nitrogen in g/24 h). Nitrogen excretion was measured from 24-h urine collections.
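The Weir conversion can be illustrated with a minimal Python sketch. The function name and the example readings are hypothetical; the coefficients are those quoted in the text. The sketch assumes that gas volumes arrive in ml/min and are converted to L/min before the factor 1440 (min per 24 h) is applied, since otherwise the result would be too large by a factor of 1000.

```python
def weir_ree_kcal_per_day(vo2_ml_min, vco2_ml_min, urinary_n_g_day):
    """REE in kcal/24 h from the Weir equation as quoted in the text.

    Assumption: VO2 and VCO2 are supplied in ml/min and converted to
    L/min here, so that multiplying by 1440 min/24 h gives kcal/24 h.
    """
    vo2_l_min = vo2_ml_min / 1000.0
    vco2_l_min = vco2_ml_min / 1000.0
    energy = (3.940 * vo2_l_min + 1.106 * vco2_l_min) * 1440.0
    # Correction for protein oxidation, based on 24-h urinary nitrogen.
    return energy - 2.17 * urinary_n_g_day


# Hypothetical reading: 250 ml/min O2, 200 ml/min CO2, 12 g N per 24 h.
example_ree = weir_ree_kcal_per_day(250.0, 200.0, 12.0)
```

With these illustrative inputs the result lies in the range expected for an adult at rest, which serves as a quick plausibility check on the units.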
Body composition was assessed by dual-energy X-ray absorptiometry (DXA) (Lunar iDXA™ GE Healthcare, USA) immediately after the IC and standard anthropometry measurements. For the analysis of fat mass, bone mass, and LBM, enCORE software v16 was used. Appendicular LBM is the sum of arm and leg LBM. Trunk LBM was calculated by subtracting appendicular LBM from whole-body LBM.
Urine and plasma sampling
On the day before the examinations, the participants collected their urine over a period of 24 h. Collection bottles were kept in cool bags with cooling units throughout. The completeness of the 24-h urine collection was verified with the para-aminobenzoic acid (PABA) method, using HPLC-UV according to Jakobsen et al. Upon delivery of the 24-h urine samples at the study center, the volume was recorded. Subsamples were centrifuged at 1850×g at 20 °C, and aliquots were stored at −196 °C until analysis.
Blood samples were collected from an antecubital vein into 9-mL EDTA plasma tubes (S-Monovette, Sarstedt, Nümbrecht, Germany). Blood was centrifuged at 1850×g at 4 °C and aliquoted into small portions. In addition, serum samples (S-Monovette Z-gel, Sarstedt, Nümbrecht, Germany) were collected for standard clinical biochemistry analyses. All samples were initially frozen at −20 °C for 1 day and then cryopreserved at −196 °C until analysis. Quality control (QC) samples were prepared by pooling fasting plasma samples and 24-h urine samples, respectively, from KarMeN participants. These QC samples were used for all analytical methods applied.
To obtain the broadest possible coverage of the metabolome of human biofluids, a number of different targeted and non-targeted analytical methods were applied. This section provides a short overview of the analytical methods used. Details are available in the supplement of Rist et al.
Non-targeted GC × GC–MS analysis of plasma and urine samples
All 24-h urine and fasting plasma samples were analyzed by non-targeted GC × GC–MS using a Shimadzu GCMS QP2010 Ultra instrument equipped with a ZOEX ZX2 modulator according to the method established by Weinert et al. This method detects a wide range of metabolites, such as amines, amino acids, organic acids, sugars, sugar alcohols, and other polyols.
Semi-targeted GC–MS analysis of sugar species in urine samples
As some isomeric sugar species cannot be sufficiently resolved with the non-targeted GC × GC–MS approach but may play an important role in human metabolism, a complementary targeted GC–MS sugar profiling method was developed for urine samples using a Shimadzu GCMS QP2010 Ultra instrument. Overall, 66 metabolites were detected with this method: 40 known sugar species, 15 unknown sugar species, and 11 non-sugar compounds.
Targeted GC–MS analysis of fatty acids in plasma
The chromatographic separation of plasma fatty acids usually requires specialized polar columns and thus cannot be performed adequately with a standard apolar × medium-polar GC × GC column setup. For this reason, we used the method described by Ecker et al. with minor modifications to determine plasma fatty acids as methyl esters. Using a GC single quadrupole instrument (Shimadzu GCMS QP2010 Ultra) and a BPX90 column (Trajan Scientific), 48 fatty acids could be determined in plasma.
LC–MS metabolite profiling using the Absolute IDQ™ p180 kit
Acyl carnitines, amino acids, biogenic amines, phosphatidylcholines, and sphingomyelins were determined by LC–MS in fasting plasma samples using the Absolute IDQ™ kit developed by Biocrates AG (Innsbruck, Austria).
Targeted LC–MS analysis of methylated amino compounds
A targeted UPLC–MS/MS quantification method for seven amino compounds in plasma, including L-carnitine, choline, and trimethylamine-N-oxide (TMAO), was established using an Acquity UPLC H-Class system coupled to a Xevo TQD triple quadrupole MS (both from Waters, Eschborn, Germany).
Targeted LC–MS analysis of bile acids
Fourteen bile acids were analyzed in fasting plasma using a 1200 series HPLC system (Agilent, Waldbronn, Germany) coupled to a Q-Trap 3200 mass spectrometer (AB Sciex, Darmstadt, Germany) as described in Frommherz et al.
Non-targeted NMR analysis of plasma and urine samples
All plasma and urine samples were analyzed by 1D-1H-NMR spectroscopy as described in Rist et al. [19, 34]. Typically, metabolites that can be detected include organic acids, amino acids, amines, sugars, sugar alcohols, and others.
Standard clinical biochemistry
Calcium, chloride, potassium, sodium, and phosphate concentrations were determined in a 24-h urine specimen. Because of potential interference with the metabolomics analyses, the urine had not been acidified with hydrochloric acid. Calcium, chloride, potassium, sodium, phosphate, and iron concentrations, as well as bilirubin, LDL-, HDL-, and total cholesterol, triglycerides, glucose, uric acid, urea, and free T3 and free T4 thyroid hormone concentrations, were determined in blood serum. Analyses were carried out by the medical laboratory MVZ Labor PD Dr. Volkmann und Kollegen GbR (Karlsruhe, Germany), which is accredited according to DIN EN ISO 15189:2001, using standard analytical procedures. Creatinine was quantified in 24-h urine specimens using a photometric assay based on the Jaffé reaction (DetectX® Urinary Creatinine Detection Kit; Arbor Assays, Ann Arbor, MI, USA). Total urinary nitrogen was quantified by the Kjeldahl method [35, 36]. 25-hydroxyvitamin D and its epimer were quantified by an in-house LC–MS/MS method using serum calibrators and controls from Chromsystems (Gräfelfing, Germany).
GC × GC–MS raw data files were processed by non-targeted alignment with in-house developed R modules as described recently. Signal intensity drift, i.e., intra- and inter-batch effects occurring during the 4–5 week measurement period, was corrected by means of regularly injected quality control (QC) samples [38, 39, 40]. For the data from the semi-targeted GC–MS analysis of sugar species in urine, an automatic integration method was set up using the Postrun Analysis feature of GCMSSolution (v 4.1.1). An Excel table with the integrated peak areas of the selected substances was created for further data processing.
LC–MS metabolite profiling (Absolute IDQ™ p180 kit)
To analyze the samples of the entire study, five Absolute IDQ well plates were used. To account for possible batch effects between the plates, data normalization as described in the manufacturer’s user manual was applied based on the pooled QC samples, which were extracted and measured ten times on each well plate in between study samples.
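The idea of QC-based plate normalization can be illustrated with the following Python sketch for a single analyte. It is not the manufacturer's documented procedure, which is not reproduced in the text; it assumes one simple scheme in which each plate is rescaled so that the median of its pooled QC replicates matches the median of all QC replicates. All names and numbers are hypothetical.

```python
import statistics


def plate_factors(qc_by_plate):
    """Per-plate scaling factor for one analyte.

    Assumed scheme: factor = median of all QC replicates divided by
    the median of the QC replicates measured on that plate.
    """
    all_qc = [v for values in qc_by_plate.values() for v in values]
    target = statistics.median(all_qc)
    return {
        plate: target / statistics.median(values)
        for plate, values in qc_by_plate.items()
    }


def normalize(samples_by_plate, factors):
    """Multiply every study-sample value by the factor of its plate."""
    return {
        plate: [v * factors[plate] for v in values]
        for plate, values in samples_by_plate.items()
    }


# Hypothetical QC replicates on two plates with a twofold plate offset.
qc = {"plate1": [10.0, 10.0, 10.0], "plate2": [20.0, 20.0, 20.0]}
factors = plate_factors(qc)
corrected = normalize({"plate1": [2.0], "plate2": [4.0]}, factors)
```

In this toy example the two study samples end up on the same scale after correction, because the offset between plates is estimated entirely from the QC replicates.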
All spectra were automatically phased with the Bruker AU program apk0.noe. Using the program AMIX (v 3.9.14) (Bruker, Rheinstetten, Germany), plasma spectra were then referenced to the EDTA signal at 2.5809 ppm and bucketed graphically such that, wherever possible, each bucket contained only one signal or group of signals and no peak was split between buckets. Urine spectra were resampled to bring them to a uniform frequency axis. The spectra were then aligned by “correlation optimized warping” and bucketed using in-house software written in Python, again defining buckets that contain only one signal or group of signals and avoiding split peaks wherever possible. Identification of metabolites important for prediction of REE, LBM, or sex was achieved with Chenomx NMR Suite 8.1 (Chenomx, Edmonton, Canada).
Data from the different analytical platforms were integrated into a common data matrix consisting of 301 samples and >1000 analytes (including knowns and unknowns). Analytes detected in fewer than 75% of the study samples were removed from the data matrix prior to statistical analysis. Non-detected values were replaced as follows: by 1/10 × limit of quantitation (LOQ) in targeted methods for which no limit of detection (LOD) was determined; by 1/2 × LOD in methods for which an LOD was determined or available; and by 1/2 × minimal intensity in non-targeted MS-based methods.
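The filtering and replacement rules can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code used in the study: non-detects are represented as `None`, the function names are hypothetical, and "minimal intensity" is taken to mean the smallest observed intensity of the same analyte, which the text does not specify.

```python
def filter_by_detection(matrix, min_fraction=0.75):
    """Keep analytes detected in at least min_fraction of the samples.

    matrix maps analyte name -> list of values (None = not detected).
    """
    kept = {}
    for name, values in matrix.items():
        detected = sum(v is not None for v in values)
        if detected / len(values) >= min_fraction:
            kept[name] = values
    return kept


def fill_value(values, loq=None, lod=None):
    """Replacement value for non-detects, following the three rules."""
    if lod is not None:
        return lod / 2.0   # methods with a determined/available LOD
    if loq is not None:
        return loq / 10.0  # targeted methods with an LOQ but no LOD
    observed = [v for v in values if v is not None]
    # Non-targeted MS: half the minimal intensity (assumed per analyte).
    return min(observed) / 2.0


def impute(values, loq=None, lod=None):
    """Return a copy of values with non-detects replaced."""
    fill = fill_value(values, loq=loq, lod=lod)
    return [fill if v is None else v for v in values]
```

For example, `impute([4.0, None, 2.0])` treats the analyte as coming from a non-targeted MS method and fills the gap with half of the smallest observed intensity.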
The columns of this common data matrix were mean-centered and scaled by their standard deviation prior to analysis. This transformation puts all analytes on a uniform scale (mean = 0, SD = 1) so that they are comparable between analytical platforms. Using the raw values without scaling would put more weight on the platform that produces the highest absolute values. The resulting matrix was used as input for three different prediction models [support vector machine (SVM) with linear kernel, generalized linear model net (glmnet), and partial least squares (PLS)]. The prediction performance of these models depends on model-specific hyperparameters, which have to be optimized. For example, SVM uses a cost parameter C that controls the trade-off between the complexity of the decision function and the training error. In glmnet, the parameters α and λ are tuned, and in PLS, the number of components (ncomp) is tuned. To find the optimal hyperparameter values, a grid search in conjunction with a nested 5 × 10-fold cross-validation (CV) scheme was applied, and the average of the resulting 50 values was used in the final model.
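The scaling step and the 5 × 10 resampling can be sketched in Python as follows. The sketch is illustrative only: the study used R with the caret package, the function names here are hypothetical, and the inner grid search of the nested scheme is not shown. It only demonstrates column-wise standardization and how five repeats of a 10-fold split yield 50 train/test partitions.

```python
import random
import statistics


def zscore_columns(rows):
    """Mean-center each column and divide by its sample SD.

    Assumes every column has non-zero variance.
    """
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    sds = [statistics.stdev(col) for col in columns]
    return [
        [(x - m) / s for x, m, s in zip(row, means, sds)]
        for row in rows
    ]


def repeated_kfold(n_samples, n_folds=10, n_repeats=5, seed=0):
    """Yield (train, test) index lists for n_repeats x n_folds CV."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh random partition per repeat
        for k in range(n_folds):
            test = order[k::n_folds]
            held_out = set(test)
            train = [i for i in order if i not in held_out]
            yield train, test


scaled = zscore_columns([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
splits = list(repeated_kfold(301))
```

After scaling, both toy columns take identical values even though their raw magnitudes differ tenfold, which is the property that makes analytes from different platforms comparable.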
REE, LBM, and sex were treated as dependent variables in the prediction models. Metabolites from targeted and non-targeted metabolomics methods (GC × GC–MS, GC–MS, LC–MS, and NMR) as well as standard clinical biochemistry were used as independent variables. For continuous outcomes, the root mean squared error (RMSE) and R2 were calculated to estimate prediction performance. For categorical outcomes, each algorithm uses a labeled data set to produce a classifier that can predict the class label of a new person. Here, classes of REE (and LBM) were defined as tertiles (lowest, middle, and highest), and these tertiles were treated as dependent variables in the prediction models. Based on the data matrix, the algorithms tried to classify subjects into the highest or lowest tertile of REE (and LBM), respectively. When the model was used to predict sex or tertiles of REE (or LBM), classification accuracy was assessed.
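The performance measures and the tertile labeling can be illustrated with a short Python sketch. The function names are hypothetical. The text does not state which definition of R2 was used; the sketch assumes the squared Pearson correlation between observed and predicted values, and 1 − SS_res/SS_tot would be an equally plausible reading. Tertiles are assigned by rank, which is also an assumption.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error of the predictions."""
    n = len(observed)
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    )


def r2_squared_correlation(observed, predicted):
    """Squared Pearson correlation (assumed definition of R2)."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


def tertile_labels(values):
    """Label each value 'low', 'middle', or 'high' by its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [None] * n
    for rank, i in enumerate(order):
        labels[i] = ("low", "middle", "high")[3 * rank // n]
    return labels


def accuracy(true_labels, predicted_labels):
    """Fraction of subjects assigned to the correct class."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)
```

A regression model would be scored with `rmse` and `r2_squared_correlation`, whereas a classifier for sex or for tertile membership would be scored with `accuracy` against the labels from `tertile_labels`.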
Finally, we ranked the top 20 analytes in the metabolite patterns of REE, LBM, and sex. To this end, analytes were assigned a rank for each algorithm according to their weight, the ranks of the three algorithms were averaged, and the analytes were sorted by mean rank. A major advantage of the mean rank is that it takes into account the metabolites identified as important for predicting REE, LBM, and sex by each of the different machine learning algorithms. Metabolites that were important in all three algorithms may therefore be considered biologically relevant.
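The mean-rank procedure can be sketched in Python as follows. The weights and analyte names are invented for illustration, and two details are assumptions not given in the text: analytes are ranked by the absolute value of their weight (rank 1 = largest), and ties are broken by input order.

```python
def mean_rank(weights_by_model):
    """Average each analyte's rank across models and sort by it.

    weights_by_model maps model name -> {analyte: weight}; all
    models are assumed to cover the same set of analytes.
    """
    analytes = list(next(iter(weights_by_model.values())))
    rank_sums = {a: 0 for a in analytes}
    for weights in weights_by_model.values():
        # Assumption: importance is the absolute weight.
        ordered = sorted(
            analytes, key=lambda a: abs(weights[a]), reverse=True
        )
        for rank, a in enumerate(ordered, start=1):
            rank_sums[a] += rank
    n_models = len(weights_by_model)
    means = {a: rank_sums[a] / n_models for a in analytes}
    return sorted(means.items(), key=lambda item: item[1])


# Hypothetical weights for three analytes in the three models.
weights = {
    "svm": {"x": 0.9, "y": -0.5, "z": 0.1},
    "glmnet": {"x": 0.2, "y": -0.8, "z": 0.0},
    "pls": {"x": 0.7, "y": 0.6, "z": 0.3},
}
ranking = mean_rank(weights)
top_20 = ranking[:20]
```

An analyte that ranks highly in only one model is pulled down by its ranks in the other two, so the head of the list favors analytes that all three models agree on.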
Identification of unknown substances from non-targeted analyses that are important for the prediction of sex or age was performed by comparison with databases, as described in Weinert et al. and Egert et al. [29, 37] for GC × GC–MS or with the Chenomx NMR Suite 8.1 (Chenomx, Edmonton, Canada) for NMR. Statistical analyses were performed with SAS (version 9.4, SAS Institute, Cary, NC, USA) and the statistical software ‘R’ (version 3.2.2) with the R package ‘caret’, version 6.0-71.