Data-dependent normalization strategies for untargeted metabolomics—a case study


Despite recent advances in the standardization of untargeted metabolomics workflows, specific data treatment strategies still receive too little attention, even though they require deep knowledge of the biological problem and should be applied only after their effect has been carefully considered. One such strategy is data normalization. Data-driven assumptions are critical, especially when addressing unwanted variation inherent to the biological model, as can be the case in heterogeneous tissues, cells of different sizes, or biofluids of different concentrations. Chronic kidney disease (CKD) is a widespread disorder affecting kidney structure and function. Animal models are being developed to gain insight into the etiopathogenesis of the condition and the effect of treatments. Moreover, diagnosis and disease staging still require appropriate biomarkers to be defined. Untargeted metabolomics has the potential to address these challenges. Renal fibrosis is one of the consequences of kidney injury, and it greatly affects the concentration of metabolites in a given quantity of sample. To overcome this challenge, several data normalization strategies were applied following a multilevel normalization method, with the overall aim of focusing on the relevant biological information and reducing the influence of disturbing factors. A comprehensive evaluation of the performance of the normalization strategies is provided, covering both methods that assess intragroup variation and the impact on differential analysis. Finally, we present evidence of the importance of normalization methods guided by the biological model and discuss multiple criteria that need to be taken into consideration to obtain robust and reliable data. Particular concern is raised about the misleading conclusions that can result from inappropriate data pre-treatment in untargeted methods.
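As a rough illustration of what one data-driven normalization step and one intragroup-variation check can look like, the sketch below implements probabilistic quotient normalization (PQN) and a per-group relative standard deviation (RSD) in Python with NumPy. Both techniques are named among the article's abbreviations, but the abstract does not describe how they were implemented. The function names, the use of the feature-wise median as the reference profile, the omission of any prior total-signal scaling, and the layout of the input matrix (samples in rows, features in columns) are therefore assumptions made here for illustration, not the authors' pipeline.

```python
import numpy as np


def pqn_normalize(X, reference=None):
    """Simplified probabilistic quotient normalization (hypothetical helper).

    X         : (n_samples, n_features) array of strictly positive intensities.
    reference : optional reference profile of length n_features; if omitted,
                the feature-wise median across all samples is used
                (an assumption of this sketch).
    Returns the normalized matrix and the estimated per-sample dilution factors.
    """
    X = np.asarray(X, dtype=float)
    if reference is None:
        reference = np.median(X, axis=0)
    # Quotient of every feature against the reference profile.
    quotients = X / reference
    # The median quotient of a sample is taken as its dilution factor.
    dilution = np.median(quotients, axis=1)
    return X / dilution[:, None], dilution


def intragroup_rsd(X, groups):
    """Per-feature RSD (%) within each group (hypothetical helper).

    groups : sequence of group labels, one per row of X.
    Returns a dict mapping str(label) to an array of RSD values.
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    rsd = {}
    for label in np.unique(groups):
        block = X[groups == label]
        # Sample standard deviation (ddof=1) relative to the group mean.
        rsd[str(label)] = 100.0 * block.std(axis=0, ddof=1) / block.mean(axis=0)
    return rsd
```

A typical use would be to call `pqn_normalize` on the sample matrix, then compare the median of `intragroup_rsd` per group before and after normalization; a lower intragroup RSD after normalization suggests that unwanted variation was reduced, although it does not by itself show that biological differences were preserved.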




Bicinchoninic acid assay


Background electrolyte


Capillary electrophoresis–mass spectrometry


Chronic kidney disease


Control genetically modified with the FAO gain-of-function group


Control wild-type group


Extracellular matrix


Electrospray ionization


Fatty acid oxidation


Median fold change


Hierarchical cluster analysis


Internal standard


Jack-knifing uncertainty measures


Obstruction genetically modified with FAO gain-of-function group


Obstruction wild-type group


Orthogonal partial least squares discriminant analysis


Phosphate-buffered saline


Principal component analysis


Partial least squares-discriminant analysis


Probabilistic quotient normalization


Quality control


Quality control samples and support vector regression correction


Relative log abundance


Relative log expression


Relative standard deviation


Time of flight


Total useful signal


Unilateral ureteral obstruction


Variable importance in projection


Wild type


Complete data matrix, all samples from experimental groups, QC samples included


All samples from experimental groups, QC samples excluded


Matrix associated only with QC samples


Matrix divided into two groups, (1) control group: CTWT and CTMOD; (2) obstruction group: OBSWT and OBSMOD


Matrix divided into four groups: (1) CTWT; (2) CTMOD; (3) OBSWT; (4) OBSMOD




This work was supported by Comunidad de Madrid (B-2017/BMD-3751 “NOVELREN-CM”), Ministerio de Ciencia, Innovación y Universidades (RTI 2018-095166-B-100) and Ministerio de Economía y Competitividad (MINECO) SAF2015-66107-R (SL), cofunded by the European Regional Development Fund and Instituto de Salud Carlos III REDinREN RD12/0021/0009 and RD16/0009/0016 (SL).

Author information



Corresponding author

Correspondence to Coral Barbas.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest.

Additional information


Published in the topical collection featuring Female Role Models in Analytical Chemistry.



Cite this article

Cuevas-Delgado, P., Dudzik, D., Miguel, V. et al. Data-dependent normalization strategies for untargeted metabolomics—a case study. Anal Bioanal Chem 412, 6391–6405 (2020).



  • Unwanted variation
  • Data pre-treatment
  • Normalization
  • Tissue samples
  • Capillary electrophoresis mass spectrometry
  • Biomarker discovery