# Comment on: “An Efficient Method to Calculate the Aggregated Isotopic Distribution and Exact Center-Masses” by Jürgen Claesen, Piotr Dittwald, Tomasz Burzykowski, Dirk Valkenborg, *J. Am. Soc. Mass Spectrom.* 2012, **23**, 753–763


## Key words

Computational method; Isotopic distribution; Isotope pattern; Accurate mass; Mass spectrometry; Proteomics; Metabolomics

Claesen et al. [1] recently presented an efficient method for computing the isotope pattern of a molecule, that is, both the isotope distribution and the center masses (also called “probability-weighted masses” and “aggregated isotopic variants” in [1]) of the isotope peaks. They favorably compare their approach against five other methods for simulating isotope patterns: their tool BRAIN is more accurate than any of the other tools, and computation times are well below 1 s even for huge molecules of mass above 533403 Da and when computing 1325 peak masses.

Unfortunately, the authors fail to mention SIRIUS [2], which is also capable of performing such calculations. The user interface of SIRIUS does not accept a molecular formula as input, but the method is accessible through the source code, which is freely available as open source at http://bio.informatik.uni-jena.de/Sirius. The mathematical details of the method were introduced in 2006 [3]. SIRIUS computations are based on the convolution of isotope distributions: in [3] it is proven that center masses, called “mean peak masses” in [2, 3], can also be computed via such convolutions. The same approach was independently, and somewhat informally, suggested in 2006 by Rockwood and Haimi [4]. Combining this with a smart Russian multiplication scheme (repeated doubling) allows us to quickly determine the isotope pattern of all atoms of each element; these per-element patterns are then convolved to obtain the final isotope pattern of the molecule. The methods implemented in SIRIUS, including the decomposition of monoisotopic masses, are also available via the Bioconductor package “Rdisop” written by Pervukhin and Neumann, http://bioconductor.org/packages/release/bioc/html/Rdisop.html
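The convolution scheme described above can be illustrated with a minimal Python sketch. It is not the SIRIUS source code: the function names, the data layout, and the isotope masses and abundances are assumptions made for illustration (approximate values of the kind tabulated in [5, 8]). Peaks are indexed by the number of extra neutrons, each peak carries a probability and a mean peak mass, and repeated doubling raises an element's pattern to the power given by its atom count.

```python
from typing import Dict, List, Tuple

# Illustrative isotope data: (extra neutrons, mass in Da, natural abundance).
# Approximate values, assumed for this sketch only.
ISOTOPES: Dict[str, List[Tuple[int, float, float]]] = {
    "C": [(0, 12.0, 0.9893), (1, 13.0033548378, 0.0107)],
    "H": [(0, 1.00782503207, 0.999885), (1, 2.0141017778, 0.000115)],
    "N": [(0, 14.0030740048, 0.99632), (1, 15.0001088982, 0.00368)],
    "O": [(0, 15.99491461956, 0.99757), (1, 16.9991317, 0.00038),
          (2, 17.999161, 0.00205)],
    "S": [(0, 31.972071, 0.9493), (1, 32.97145876, 0.0076),
          (2, 33.9678669, 0.0429), (4, 35.96708076, 0.0002)],
}

# A pattern is (peak probabilities, mean peak masses), indexed by extra neutrons.
Pattern = Tuple[List[float], List[float]]


def element_pattern(symbol: str) -> Pattern:
    """Isotope pattern of a single atom; missing nominal masses get probability 0."""
    isotopes = ISOTOPES[symbol]
    size = max(shift for shift, _, _ in isotopes) + 1
    probs, masses = [0.0] * size, [0.0] * size
    for shift, mass, abundance in isotopes:
        probs[shift], masses[shift] = abundance, mass
    return probs, masses


def convolve(a: Pattern, b: Pattern, max_peaks: int) -> Pattern:
    """Convolve two patterns, keeping only the first `max_peaks` peaks."""
    (pa, ma), (pb, mb) = a, b
    size = min(len(pa) + len(pb) - 1, max_peaks)
    probs, weighted = [0.0] * size, [0.0] * size
    for i in range(min(len(pa), size)):
        if pa[i] == 0.0:
            continue
        for j in range(min(len(pb), size - i)):
            joint = pa[i] * pb[j]
            probs[i + j] += joint
            # Accumulate probability-weighted mass; normalised below.
            weighted[i + j] += joint * (ma[i] + mb[j])
    masses = [w / p if p > 0.0 else 0.0 for w, p in zip(weighted, probs)]
    return probs, masses


def power(pattern: Pattern, count: int, max_peaks: int) -> Pattern:
    """Pattern of `count` identical atoms via repeated doubling."""
    result: Pattern = ([1.0], [0.0])  # neutral element of the convolution
    base = pattern
    while count > 0:
        if count & 1:
            result = convolve(result, base, max_peaks)
        count >>= 1
        if count:
            base = convolve(base, base, max_peaks)
    return result


def isotope_pattern(formula: Dict[str, int], max_peaks: int = 10) -> Pattern:
    """Convolve the per-element patterns into the pattern of the molecule."""
    result: Pattern = ([1.0], [0.0])
    for symbol, count in formula.items():
        per_element = power(element_pattern(symbol), count, max_peaks)
        result = convolve(result, per_element, max_peaks)
    return result


probs, masses = isotope_pattern({"C": 50, "H": 71, "N": 13, "O": 12}, max_peaks=50)
average_mass = sum(p * m for p, m in zip(probs, masses))
print(f"monoisotopic mass {masses[0]:.6f} Da, average mass {average_mass:.6f} Da")
```

Because peak k of a convolution depends only on peaks 0 to k of its two factors, truncating every intermediate pattern to `max_peaks` entries leaves the retained peaks unchanged. This is one way to see why restricting the computation to a few peaks keeps it cheap.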

Table 1. Molecular formulas, average masses computed by SIRIUS, and running times of SIRIUS and BRAIN. Average mass and mass delta in Dalton. “No. peaks” is the number of center masses (mean peak masses) computed by the two methods. Running times in milliseconds. Running times for BRAIN taken from [1]. “R-BRAIN” is the running time of the R implementation of BRAIN.

| No. | Molecular formula | Average mass, SIRIUS (Da) | Average mass, theoretical (Da) | No. peaks | Running time, SIRIUS (ms) | Running time, BRAIN (ms) | Running time, R-BRAIN (ms) |
|---|---|---|---|---|---|---|---|
| (1) | C50H71N13O12 | 1046.181107 | 1046.181107 | 50 | 5.9 | 37.5 | 18.2 |
| (2) | C254H377N65O75S6 | 5733.510759 | 5733.510759 | 50 | 9.4 | 37.0 | 17.1 |
| (3) | C520H817N139O147S8 | 11624.448751 | 11624.448751 | 50 | 14.8 | 37.6 | 17.7 |
| (4) | C744H1224N210O222S5 | 16823.321352 | 16823.321352 | 100 | 11.3 | 37.0 | 32.6 |
| (5) | C2023H3208N524O619S20 | 45415.679370 | 45415.679370 | 322 | 41.4 | 72.3 | 114.5 |
| (6) | C2934H4615N781O897S39 | 66432.455560 | 66432.455560 | 400 | 62.7 | 75.4 | 146.8 |
| (7) | C5047H8014N1338O1495S48 | 112895.125932 | 112895.125932 | 643 | 125.1 | 156.0 | 280.9 |
| (8) | C8574H13378N2092O2392S77 | 186506.052593 | 186506.052593 | 807 | 164.0 | 216.8 | 388.5 |
| (9) | C17600H26474N4752O5486S197 | 398722.972482 | 398722.972482 | 1163 | 312.4 | 355.7 | 661.1 |
| (10) | C23832H37816N6528O7031S170 | 533735.214649 | 533735.214649 | 1325 | 400.7 | 408.6 | 791.6 |

The accuracy of SIRIUS is practically identical to that of BRAIN: since SIRIUS also uses exact monoisotopic masses, the mass difference between the calculated and the theoretical monoisotopic peak is zero; compare with Table 3 in [1]. For the average mass, the mass computed by SIRIUS (by taking the weighted sum over all masses of the isotope pattern) and the theoretical average mass are again identical; see Table 1 and compare with Table 4 in [1]. Finally, we also compare the running times of SIRIUS and BRAIN. We report running times from [1] (Table 5), where BRAIN is implemented in Matlab and run on an Intel Core 2 Duo processor at 2.26 GHz with 4 GB RAM. SIRIUS was run on a MacBook Pro with an Intel Core 2 Duo processor at 2.66 GHz and 4 GB RAM, using the Java virtual machine version 1.6.0. One can see that running times are very similar. We also evaluated the R implementation of BRAIN (again on the MacBook Pro), which is available as a Bioconductor package, http://bioconductor.org/packages/devel/bioc/html/BRAIN.html.
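As a rough cross-check of the average masses in Table 1, the theoretical average mass can be computed directly from a molecular formula by summing, over all atoms, the abundance-weighted mean mass of each element. The sketch below is illustrative only: the helper names are hypothetical, and the isotope masses and abundances are approximate values assumed here (of the kind tabulated in [5, 8]), so results should land close to, but not necessarily exactly on, the tabulated figures.

```python
import re
from typing import Dict, List, Tuple

# Illustrative isotope data: (mass in Da, natural abundance).
# Approximate values, assumed for this sketch only.
ISOTOPES: Dict[str, List[Tuple[float, float]]] = {
    "C": [(12.0, 0.9893), (13.0033548378, 0.0107)],
    "H": [(1.00782503207, 0.999885), (2.0141017778, 0.000115)],
    "N": [(14.0030740048, 0.99632), (15.0001088982, 0.00368)],
    "O": [(15.99491461956, 0.99757), (16.9991317, 0.00038), (17.999161, 0.00205)],
    "S": [(31.972071, 0.9493), (32.97145876, 0.0076),
          (33.9678669, 0.0429), (35.96708076, 0.0002)],
}


def parse_formula(formula: str) -> Dict[str, int]:
    """Parse a flat formula such as 'C50H71N13O12' into element counts."""
    counts: Dict[str, int] = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def average_mass(formula: str) -> float:
    """Sum of atom counts times the abundance-weighted mean mass of each element."""
    total = 0.0
    for symbol, count in parse_formula(formula).items():
        element_mean = sum(mass * abundance for mass, abundance in ISOTOPES[symbol])
        total += count * element_mean
    return total


print(f"{average_mass('C50H71N13O12'):.6f}")
```

The same number results, up to rounding, from taking the weighted sum over a full isotope pattern, since the expected mass of a sum of independent atoms is the sum of their expected masses.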

Masses in Table 1 have been rounded to six decimal places, and it appears that this is also true for all tables in [1]. In fact, there is a slight error in the average mass, which was well below 0.002 ppb (parts per billion) for all 10 molecules. We stress that a certain error is inevitable when computations are carried out using machine numbers, because rounding errors accumulate. When higher mass accuracy is needed, other data types such as the BigDecimal type in Java can be used, at the expense of increased running times. But this appears to be a wasteful undertaking, given that masses and, in particular, abundances of isotopes are known only with rather limited precision.
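To make the ppb figures concrete, the following sketch shows how a relative mass error in parts per billion can be computed, and how an arbitrary-precision decimal type can serve as a reference for a floating-point result. Python's `decimal` module is used here as a stand-in for Java's BigDecimal; the helper names and the carbon isotope values are assumptions for illustration, not part of SIRIUS.

```python
from decimal import Decimal, localcontext
from typing import Iterable, Tuple


def ppb_error(computed: float, reference: float) -> float:
    """Relative deviation of `computed` from `reference` in parts per billion."""
    return (computed - reference) / reference * 1e9


def weighted_sum_float(terms: Iterable[Tuple[str, str]]) -> float:
    """Abundance-weighted mass using ordinary double-precision floats."""
    return sum(float(mass) * float(abundance) for mass, abundance in terms)


def weighted_sum_decimal(terms: Iterable[Tuple[str, str]], digits: int = 50) -> Decimal:
    """The same sum in decimal arithmetic with `digits` significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits
        return sum((Decimal(mass) * Decimal(abundance) for mass, abundance in terms),
                   Decimal(0))


# Approximate carbon isotope masses and abundances, given as strings so that
# the decimal computation starts from exactly the digits written here.
CARBON = [("12.0", "0.9893"), ("13.0033548378", "0.0107")]

reference = weighted_sum_decimal(CARBON)
deviation = ppb_error(weighted_sum_float(CARBON), float(reference))
print(f"reference {reference} Da, float deviation {deviation:.3e} ppb")
```

For a single short sum like this one, the floating-point deviation is far below the 0.002 ppb mentioned above; rounding errors only become visible after many accumulated operations, and even then they remain small compared with the uncertainty of the isotope data.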

The convolution method implemented in SIRIUS [2, 3] is easy to understand and straightforward to implement. This method is also very fast when calculations are limited to only a few (say, 10) peaks. This is important when many isotope patterns have to be simulated, for example in the SIRIUS pipeline, where an isotope pattern is simulated for each decomposition of the monoisotopic mass and compared against the measured isotope pattern [2]. In this way, SIRIUS can decompose a monoisotopic mass, simulate isotope patterns for about 1000 molecular formulas, and match them against the measured pattern in less than a second [2]. On the other hand, the mathematically more involved method of Claesen et al. is possibly faster for very large molecules such as the human dynein heavy chain. We note that SIRIUS is implemented in Java and, hence, compiled to Java bytecode that is compiled further at run time, whereas BRAIN is implemented in R and, hence, interpreted. It is therefore likely that a constant-factor improvement in running time can be reached by implementing BRAIN in a compiled language. Conversely, SIRIUS has not been designed to compute isotope patterns of molecules this large, so it is likely that its running times can be further improved if this is required.

In summary, the choice of method seems to be a matter of user preference, as both methods reach the same high accuracy and their running times are very similar. BRAIN and, hence, also SIRIUS outperform all other methods evaluated in [1] (namely, Emass, Mercury, NeutronCluster, IsoPro, and IsoDalton) with respect to accuracy and sometimes even running time; see [1] for details.

## Notes

### Acknowledgments

All computations were carried out by Franziska Hufsky.

## References

1. Claesen, J., Dittwald, P., Burzykowski, T., Valkenborg, D.: An efficient method to calculate the aggregated isotopic distribution and exact center-masses. *J. Am. Soc. Mass Spectrom.* **23**, 753–763 (2012)
2. Böcker, S., Letzel, M., Lipták, Zs., Pervukhin, A.: SIRIUS: decomposing isotope patterns for metabolite identification. *Bioinformatics* **25**(2), 218–224 (2009)
3. Böcker, S., Letzel, M., Lipták, Zs., Pervukhin, A.: Decomposing metabolomic isotope patterns. In: Proc. of Workshop on Algorithms in Bioinformatics (WABI 2006), vol. 4175 of Lect. Notes Comput. Sci., pp. 12–23. Springer, Berlin (2006)
4. Rockwood, A.L., Haimi, P.: Efficient calculation of accurate masses of isotopic peaks. *J. Am. Soc. Mass Spectrom.* **17**(3), 415–419 (2006)
5. Audi, G., Wapstra, A., Thibault, C.: The AME2003 atomic mass evaluation (II): tables, graphs, and references. *Nucl. Phys. A* **729**, 129–336 (2003)
6. Wieser, M.E.: Atomic weights of the elements 2005 (IUPAC Technical Report). *Pure Appl. Chem.* **78**(11), 2051–2066 (2006)
7. de Laeter, J.R., Böhlke, J.K., Biévre, P.D., Hidaka, H., Peiser, H.S., Rosman, K.J.R., Taylor, P.D.P.: Atomic weights of the elements. Review 2000 (IUPAC Technical Report). *Pure Appl. Chem.* **75**(6), 683–800 (2003)
8. Rosman, K., Taylor, P.: Isotopic compositions of the elements 1997. *Pure Appl. Chem.* **70**(1), 217–235 (1998)
9. Olson, M.T., Yergey, A.L.: Calculation of the isotope cluster for polypeptides by probability grouping. *J. Am. Soc. Mass Spectrom.* **20**(2), 295–302 (2009)