Unveiling the Links Between Peptide Identification and Differential Analysis FDR Controls by Means of a Practical Introduction to Knockoff Filters

Statistical Analysis of Proteomic Data

Part of the book series: Methods in Molecular Biology ((MIMB,volume 2426))

Abstract

In proteomic differential analysis, FDR control is often performed through a multiple testing correction (i.e., an adjustment of the original p-values). In this protocol, we apply a recent alternative method based on so-called knockoff filters. It shares interesting conceptual similarities with the target–decoy competition procedure, which is classically used in proteomics for FDR control at the peptide identification stage. To provide practitioners with a unified understanding of FDR control in proteomics, we apply the knockoff procedure to real and simulated quantitative datasets. Leveraging these comparisons, we propose an adaptation of the knockoff procedure that better fits the specificities of quantitative proteomic data (mainly, very few samples). The performance of the knockoff procedure is compared with that of the classical Benjamini–Hochberg procedure, thereby shedding new light on the strengths and weaknesses of target–decoy competition.
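To make the contrast in the abstract concrete, here is a minimal Python sketch of the two families of FDR control it mentions. It is not the chapter's own code. The function names `bh_adjust` and `knockoff_plus_threshold` are hypothetical, and the toy inputs are invented for illustration. The first function computes Benjamini–Hochberg adjusted p-values. The second applies the competition-based threshold rule of the knockoff+ filter (Barber and Candès, 2015) to a vector of signed statistics, where a positive value means the original feature beat its knockoff and a negative value means the knockoff won. Target–decoy competition estimates the FDR in a similar way, by counting decoy wins.

```python
import numpy as np


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Scale the k-th smallest p-value by m / k.
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, starting from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.minimum(scaled, 1.0)
    return adjusted


def knockoff_plus_threshold(w, q):
    """Smallest t with (1 + #{w_j <= -t}) / max(1, #{w_j >= t}) <= q.

    Returns np.inf when no candidate threshold meets the target level q,
    in which case nothing is selected.
    """
    w = np.asarray(w, dtype=float)
    candidates = np.unique(np.abs(w[w != 0]))  # np.unique returns sorted values
    for t in candidates:
        fdp_estimate = (1 + np.sum(w <= -t)) / max(1, np.sum(w >= t))
        if fdp_estimate <= q:
            return float(t)
    return np.inf


# Toy inputs (illustrative only).
pvals = [0.01, 0.04, 0.03, 0.005]
adjusted = bh_adjust(pvals)

w_stats = np.array([5, 4, 3, 2.5, 2, -1, 1.5, -0.5, 0.2])
threshold = knockoff_plus_threshold(w_stats, q=0.25)
selected = np.flatnonzero(w_stats >= threshold)
```

With the same toy statistics and a stricter target such as `q=0.1`, the function returns `np.inf` and nothing is selected. The `+1` in the numerator makes the estimate conservative when only a few statistics are available.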


Acknowledgements

This work was supported by grants from the French National Research Agency: ProFI project (ANR-10-INBS-08), GRAL project (ANR-10-LABX-49-01), DATA@UGA and SYMER projects (ANR-15-IDEX-02) and MIAI @ Grenoble Alpes (ANR-19-P3IA-0003).

Corresponding author

Correspondence to Lucas Etourneau.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Etourneau, L., Varoquaux, N., Burger, T. (2023). Unveiling the Links Between Peptide Identification and Differential Analysis FDR Controls by Means of a Practical Introduction to Knockoff Filters. In: Burger, T. (eds) Statistical Analysis of Proteomic Data. Methods in Molecular Biology, vol 2426. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1967-4_1

  • DOI: https://doi.org/10.1007/978-1-0716-1967-4_1

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-1966-7

  • Online ISBN: 978-1-0716-1967-4

  • eBook Packages: Springer Protocols
