Progressive Calibration and Averaging for Tandem Mass Spectrometry Statistical Confidence Estimation: Why Settle for a Single Decoy?
Estimating the false discovery rate (FDR) among a list of tandem mass spectrum identifications is mostly done through target-decoy competition (TDC). Here we offer two new methods that can use an arbitrarily small number of additional randomly drawn decoy databases to improve TDC. Specifically, “Partial Calibration” utilizes a new meta-scoring scheme that allows us to gradually benefit from the increase in the number of identifications that calibration yields, and “Averaged TDC” (a-TDC) reduces the liberal bias of TDC for small FDR values as well as its variability throughout. Combining a-TDC with “Progressive Calibration” (PC), which attempts to find the “right” number of decoys required for calibration, we see substantial impact in real datasets: when analyzing the Plasmodium falciparum data, it typically yields almost the entire 17% increase in discoveries that “full calibration” yields (at FDR level 0.05) while using 60 times fewer decoys. Our methods are further validated using a novel realistic simulation scheme and, importantly, they apply more generally to the problem of controlling the FDR among discoveries from searching an incomplete database.
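For readers unfamiliar with the single-decoy baseline that these methods build on, the following is a minimal sketch of a standard TDC-style FDR estimate. It is not the Partial Calibration, Progressive Calibration, or a-TDC procedure of this work. The function name `tdc_fdr`, its arguments, and the toy scores are hypothetical, and the sketch assumes one best target score and one best decoy score per spectrum, higher scores being better, with ties discarded.

```python
def tdc_fdr(target_scores, decoy_scores, threshold):
    """Estimate the FDR among target wins scoring >= threshold.

    target_scores[i] and decoy_scores[i] are the best target and best
    decoy match scores for spectrum i (assumed layout, higher is better).
    """
    target_wins = 0
    decoy_wins = 0
    for t, d in zip(target_scores, decoy_scores):
        # Competition: only the better of the two matches is kept.
        if max(t, d) < threshold:
            continue
        if t > d:
            target_wins += 1
        elif d > t:
            decoy_wins += 1
        # Ties are discarded here (an assumption; random tie-breaking
        # is a common alternative).
    if target_wins == 0:
        return 0.0
    # Decoy wins stand in for the unknown number of false target wins.
    return min(1.0, decoy_wins / target_wins)


# Toy scores for five spectra (made up for illustration only).
targets = [10, 8, 3, 7, 2]
decoys = [4, 9, 1, 5, 6]
print(tdc_fdr(targets, decoys, threshold=6.5))
```

Because the estimate depends on a single randomly drawn decoy database, it varies from draw to draw, which is the variability that using additional decoy databases is meant to reduce.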
Keywords: Tandem mass spectrometry, Spectrum identification, False discovery rate, Calibration