We can evaluate these methods of TDC saturation correction by assessing how closely their corrected values adhere to those expected from the theoretical isotope pattern of a known compound. More direct methods of evaluation are difficult for real data, since the true ion count is unknown. We illustrate only correction methods (2) and (3) on real data, because the shape of the chromatographic peaks is not understood well enough for method (1) to be used reliably. The mass peak is modeled as Gaussian, although this functional form is only an approximation of the true shape at high ion counts.
When working with LC/MS data, the performance of methods (2) and (3) can be evaluated by applying them to the mass peaks of a single compound that induces a very strong signal. This is because methods (2) and (3) operate on the mass peak observed in individual chromatographic scans, and for strong signals these mass peaks typically range from very small, near the edges of the chromatographic peak, to very large, near its apex. The accuracy of the correction methods can therefore be examined over the full range of intensities likely to be encountered under standard experimental settings with such data. Since it is primarily the intensities of the mass peaks, rather than the identities of the compounds inducing them, that affect the performance of the corrections, applying the correction methods to additional compounds would yield little additional information. The signal induced by salicylic acid was identified as being amongst the strongest present in an LC/TOFMS data set derived from a sample of synthetic urine as part of an experiment described elsewhere . Therefore, the two lowest-mass isotopologues of salicylic acid (shown on the heatmap in Figure 1 of the Supplementary Information) were used to validate methods (2) and (3).
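The expected intensity ratio between the two lowest-mass isotopologues can be approximated from natural isotopic abundances. The following is a minimal sketch, not the calculation used in this work: it assumes the neutral elemental composition C7H6O3 for salicylic acid (the observed ion may differ by a proton, which changes the result only slightly), rounded abundance values, and a mass resolution that does not separate the individual M+1 species. The function and variable names are illustrative.

```python
# Approximate natural isotopic abundances as (light, heavy-by-one-neutron).
# Rounded values; treat them as assumptions of this sketch.
ABUNDANCE = {
    "C": (0.9893, 0.0107),      # 12C, 13C
    "H": (0.999885, 0.000115),  # 1H, 2H
    "O": (0.99757, 0.00038),    # 16O, 17O
}


def m1_to_m0_ratio(formula):
    """Return the expected M+1 / M intensity ratio.

    `formula` maps element symbols to atom counts. The M+1 isotopologue
    contains exactly one atom with one extra neutron, so the ratio is the
    sum over elements of n * (heavy abundance / light abundance).
    """
    return sum(
        n * ABUNDANCE[element][1] / ABUNDANCE[element][0]
        for element, n in formula.items()
    )


SALICYLIC_ACID = {"C": 7, "H": 6, "O": 3}  # assumed neutral composition
ratio = m1_to_m0_ratio(SALICYLIC_ACID)
print(f"Expected M+1 / M ratio: {ratio:.4f}")
```

Under these assumptions the ratio is a little under 0.08, dominated by the seven carbon atoms; corrected ion counts for matching scans should fall close to a straight line with this slope.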
The sums of the ion counts observed across the mass peaks of the two isotopologues are plotted against each other for matching chromatographic scans in Figure 1, along with the theoretical isotope ratio for salicylic acid. The raw ion counts adhere to the predicted isotope ratio at low intensities, but deviations become increasingly severe as the ion count of the monoisotopic peak approaches the number of TOF acquisitions per chromatographic scan, which corresponds to full saturation. The estimated intensities provided by correction methods (2) and (3) are also shown. For low ion counts, these are only marginally greater than the raw ion counts; however, they largely restore the correct isotope ratio for monoisotopic intensities of up to around 4000 for correction method (3) and 5000 for correction method (2). Although there are substantial deviations from the true isotope ratio at the most heavily saturated scans, these two correction methods clearly provide a strong improvement over the isotope ratio suggested by the raw ion counts and thereby provide further support for the validity of the basic model.
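The qualitative behavior described here (a negligible correction at low counts and a rapidly growing one as the raw count approaches the number of acquisitions) can be illustrated with the textbook whole-peak Poisson correction. This is a sketch under simplifying assumptions and is not claimed to be identical to correction method (2) or (3): it assumes that at most one ion is registered per acquisition for the peak and that ion arrivals are Poisson distributed. The function name and the acquisition count of 10000 are hypothetical.

```python
import math


def poisson_corrected_count(observed, n_acquisitions):
    """Estimate the true ion count from a saturated TDC count.

    If each acquisition registers at most one ion, the fraction of
    acquisitions with no registered ion estimates exp(-rate), so the
    total number of arriving ions is about -N * ln(1 - observed / N).
    """
    if not 0 <= observed < n_acquisitions:
        raise ValueError("observed must satisfy 0 <= observed < n_acquisitions")
    return -n_acquisitions * math.log(1.0 - observed / n_acquisitions)


N_ACQUISITIONS = 10_000  # hypothetical number of TOF acquisitions per scan
for raw in (100, 1_000, 5_000, 9_000):
    corrected = poisson_corrected_count(raw, N_ACQUISITIONS)
    print(f"raw = {raw:5d}  corrected = {corrected:9.1f}")
```

At a raw count of 100 the correction adds well under 1%, whereas at 9000 the estimate is more than twice the raw count. The estimate also becomes very sensitive to counting noise as the raw count approaches the number of acquisitions.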
The discrepancies from the true isotope ratios at high intensities are due to the heavy non-Gaussian tails of the largest mass peaks, which exceed the duration of the detector dead time, as explained in the experimental section of the Supplementary Information. However, for instruments in which the dead time exceeds the mass peak width and the mass and chromatographic peak shapes conform to known mathematical functions, a much better effective dynamic range may be attained by the very same corrections. Although we are only able to illustrate this with simulated data, these data are derived from the basic model, which we believe provides the closest approximation to the true distribution of raw TOFMS data that has been published to date. Moreover, its limitations are known, and it is likely possible to address them with engineering solutions. The following simulations therefore demonstrate potential improvements that may be within reach if instruments are devised whose data can be modeled more accurately.
The plots in Figure 2 show the true and the observed ion counts of a simulated peak in the chromatographic and m/z dimensions, along with the results of all three correction methods. Despite the very heavy saturation, correction methods (1) and (2) provide good estimates of the true rate of ion arrivals. Correction method (1) is expected to perform best, as it can reliably combine observations from multiple chromatographic scans using knowledge of the general variation in the chromatographic dimension. Correction method (3) performs well on the low-mass sides of mass peaks, where a large number of the TOF acquisitions are capable of registering ions, but poorly on the high-mass sides, where most acquisitions are unavailable due to dead time.
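The asymmetry between the low-mass and high-mass sides of a saturated mass peak can be reproduced with a toy simulation. The sketch below is not the basic model used for Figure 2; it assumes Gaussian arrival times within each acquisition, Poisson-distributed ion numbers, and a non-extending dead time that is much longer than the peak width. Earlier arrival times correspond to lower m/z. All names and parameter values are hypothetical.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson variate using Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_tdc(n_acquisitions, mean_ions, sigma, dead_time, seed=0):
    """Return (true, registered) arrival times over all acquisitions."""
    rng = random.Random(seed)
    true_times, registered_times = [], []
    for _ in range(n_acquisitions):
        n_ions = poisson(rng, mean_ions)
        arrivals = sorted(rng.gauss(0.0, sigma) for _ in range(n_ions))
        true_times.extend(arrivals)
        last = -math.inf
        for t in arrivals:
            # An ion is registered only if the detector has recovered.
            if t - last >= dead_time:
                registered_times.append(t)
                last = t
    return true_times, registered_times


def registered_fraction(true_times, registered_times, low, high):
    """Fraction of true arrivals in [low, high) that were registered."""
    n_true = sum(low <= t < high for t in true_times)
    n_registered = sum(low <= t < high for t in registered_times)
    return n_registered / n_true if n_true else float("nan")


true_t, reg_t = simulate_tdc(2000, mean_ions=3.0, sigma=1.0, dead_time=10.0)
low_side = registered_fraction(true_t, reg_t, -math.inf, 0.0)
high_side = registered_fraction(true_t, reg_t, 0.0, math.inf)
print(f"registered fraction, low-mass side:  {low_side:.2f}")
print(f"registered fraction, high-mass side: {high_side:.2f}")
```

In this toy setting the registered fraction is markedly lower on the high-mass side, because the first ion in an acquisition usually arrives early and blinds the detector for the remainder of the peak. A bin-wise correction therefore has few usable acquisitions to work with on the high-mass side.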
The more general performance of correction methods (2) and (3) depends on numerous factors, including the TDC time resolution, the width of the mass peaks, and whether the latter quantity is known in advance or must be estimated from the data. Correction method (2) generally provides a modest improvement over method (3). For very high intensities, the latter will reach a plateau (see Figure 2), whereas the former will exhibit very high variance. However, correction method (1) provides highly accurate estimates for all realistic settings that we have examined. Even for peaks of 10^7 ions (100 times larger than the one shown in Figure 2), an excellent fit is obtained, amounting to an enhancement in effective dynamic range of over four orders of magnitude. Such improvements may be compared with those achieved via (potentially costly) engineering solutions, which in  enhance the detection efficiency by a factor of around 2.5 and in  increase the dynamic range by about one order of magnitude. We therefore believe that a strong case can be made for devoting further efforts to addressing the problem of TDC saturation via statistical corrections.